Creating a PDF with an image in iTextSharp

This article shows how to create a PDF file containing an image in iTextSharp. It covers the following features:

  • Image is stored as an embedded resource in a .NET class library
  • Content of the PDF is created from HTML

The iTextSharp library provides a way to create a PDF from HTML. But when the PDF should contain images that are not accessible via a public URL some adjustments are necessary. To demonstrate that I decided to embed the image to a .NET class library. This is just an example. The image could also be saved on a filesystem or even be stored as a base64 encoded string in a SQL database.

To be as close as possible to the HTML specification I use the img-tag to reference the image. The HTML markup is created via a XSL transformation.

Content of the xslt:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                              xmlns:msxsl="urn:schemas-microsoft-com:xslt"
                              exclude-result-prefixes="msxsl">
    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="/">
      <html>
        <head>
          <title></title>
        </head>
        <body>
          <div style="min-width: 1237px;max-width: 1270px;position: relative;margin: 0 auto;">
            <img src="data:imagestream/EmbeddedImage.Resources.image.jpg" />
          </div>
          <div>It´s working!!</div>
        </body>
      </html>
    </xsl:template>
</xsl:stylesheet>

The src-attribute doesn´t contain a http-Url. Instead I use the prefix data:imagestream to identify the source type of the image. After the following slash the name of the resource in the manifest of the .NET library is listed. That´s the location where differenty source types could be defined.

Now I have to teach iTextSharp a different handling for the img-tag.

This method takes the HTML content and converts it to PDF.

public Stream CreateFromHtml(string html)
{
	var stream = new MemoryStream();

	using (var doc = new Document(PageSize.A4))
	{
		using (var ms = new MemoryStream())
		{
			using (var writer = PdfWriter.GetInstance(doc, ms))
			{
				writer.CloseStream = false;
				doc.Open();

				var tagProcessors = (DefaultTagProcessorFactory)Tags.GetHtmlTagProcessorFactory();
				tagProcessors.RemoveProcessor(HTML.Tag.IMG);
				tagProcessors.AddProcessor(HTML.Tag.IMG, new CustomImageTagProcessor()); 

				var cssFiles = new CssFilesImpl();
				cssFiles.Add(XMLWorkerHelper.GetInstance().GetDefaultCSS());
				var cssResolver = new StyleAttrCSSResolver(cssFiles);

				var charset = Encoding.UTF8;
				var context = new HtmlPipelineContext(new CssAppliersImpl(new XMLWorkerFontProvider()));
				context.SetAcceptUnknown(true).AutoBookmark(true).SetTagFactory(tagProcessors);
				var htmlPipeline = new HtmlPipeline(context, new PdfWriterPipeline(doc, writer));
				var cssPipeline = new CssResolverPipeline(cssResolver, htmlPipeline);
				var worker = new XMLWorker(cssPipeline, true);
				var xmlParser = new XMLParser(true, worker, charset);

				using (var sr = new StringReader(html))
				{
					xmlParser.Parse(sr);
					doc.Close();
					ms.Position = 0;
					ms.CopyTo(stream);
					stream.Position = 0;
				}
			}
		}
	}

	return stream;
}

Here is where the magic happens. The default handling for img-tags is replaced by a custom tag processor.

var tagProcessors = (DefaultTagProcessorFactory)Tags.GetHtmlTagProcessorFactory();
tagProcessors.RemoveProcessor(HTML.Tag.IMG);
tagProcessors.AddProcessor(HTML.Tag.IMG, new CustomImageTagProcessor());

The custom tag processor only steps in with custom logic, when the src-tag starts with data:imagestream. That ensures that all other “legal” images are resolved by the logic of the default processor.

public class CustomImageTagProcessor : iTextSharp.tool.xml.html.Image
{
	public override IList<IElement> End(IWorkerContext ctx, Tag tag, IList<IElement> currentContent)
	{
		var src = String.Empty;

		if (!tag.Attributes.TryGetValue(HTML.Attribute.SRC, out src))
			return new List<IElement>(1);

		if (String.IsNullOrWhiteSpace(src))
			return new List<IElement>(1);

		if (src.StartsWith("data:imagestream/", StringComparison.InvariantCultureIgnoreCase))
		{
			var name = src.Substring(src.IndexOf("/", StringComparison.InvariantCultureIgnoreCase) + 1);

			using (var stream = Assembly.GetExecutingAssembly().GetManifestResourceStream(name))
			{
				return CreateElementList(ctx, tag, Image.GetInstance(stream));
			}
		}

		return base.End(ctx, tag, currentContent);
	}

	protected IList<IElement> CreateElementList(IWorkerContext ctx, Tag tag, Image image)
	{
		var htmlPipelineContext = GetHtmlPipelineContext(ctx);
		var result = new List<IElement>();
		var element = GetCssAppliers().Apply(new Chunk((Image) GetCssAppliers().Apply(image, tag, htmlPipelineContext), 0, 0, true), tag, htmlPipelineContext);
		result.Add(element);

		return result;
	}
}

The following sample code creates the HTML, converts it to the PDF and saves the result to a file.

class Program
{
	static void Main(string[] args)
	{
		string html;

		using (var xsltStream = Assembly.GetExecutingAssembly().GetManifestResourceStream("EmbeddedImage.Resources.Pdf.xslt"))
		{
			var reader = XmlReader.Create(xsltStream);
			var xslt = new XslCompiledTransform();
			xslt.Load(reader);

			using (var ms = new MemoryStream())
			{
				xslt.Transform(new XmlDocument(), new XsltArgumentList(), ms);
				ms.Position = 0;
				using (var r = new StreamReader(ms))
				{
					html = r.ReadToEnd();
				}
			}
		}

		var pdfService = new TextSharpPdfService();
		using (var pdf = pdfService.CreateFromHtml(html))
		{
			using (var fs = new FileStream(@"c:\temp\embeddedimage.pdf", FileMode.Create))
			{
				pdf.CopyTo(fs);
			}
		}
	}
}

And here´s the result:

embedded_image_pdf

In my next post I will show how to use the same embedded resource image as header for a HTML mail.

Advertisements

Invalid certificate thumbprint in Windows Azure WebRole/WorkerRole

A communication via SSL is nowadays more important than ever. Hence at Tekaris I had to secure a website hosted in a Windows Azure Webrole with a SSL certificate.

I´ll demonstrate the way I did it based on a demo project. Additionally to uploading the certificate(s) to the Windows Azure Portal you have to add the certificates thumbprints to the appropriate WebRole or WorkerRole.

I opened the properties of the WebRole

Webrole_Properties

and clicked on “Add Certificate”

WebRole_Add_Certificate

Then it´s necessary to insert the thumbprint of the certificate. Microsoft provides instructions how to get that. I take the GlobalSign Root CA certificate as example.

Certificate_Details

As described in the instructions I copied the hexadecimal value of the thumbprint and inserted it into Visual Studio.

Webrole_Properties_With_Thumbprint

That´s it. Save – Build – Build failed… Well, why isn´t it surprising me?

I got the following errormessage:

The XML specification is not valid: The ‘thumbprint’ attribute is invalid – The value ‘‎b1 bc 96 8b d4 f4 9d 62 2a a8 9a 81 f2 15 01 52 a4 1d 82 9c’ is invalid according to its datatype ‘http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration:ThumbprintType&#8217; – The Pattern constraint failed.

Invalid_Thumbprint

My first suspicion was that I eventually have to remove the blanks but that didn´t solve it. So after counting the correct amount of hexadecimal characters (40) I noticed that I had to press the right cursor button twice at the beginning of the string in order to get past the first character. 

Indeed there was an “invisible” character at the first position that caused that error. The character has the decimal value 8206 and represents a “left-to-right mark“. After removing it everything worked like a charm.