<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Blackfin Fast JPEG Encoding</title>
	<atom:link href="http://blog.frankvh.com/2009/06/09/blackfin-fast-jpeg-encoding/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.frankvh.com/2009/06/09/blackfin-fast-jpeg-encoding/</link>
	<description>A bunch of random musings, with a leaning towards electronics &#38; computers.</description>
	<lastBuildDate>Sun, 08 Jan 2012 21:10:40 -0800</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Jim Dallas</title>
		<link>http://blog.frankvh.com/2009/06/09/blackfin-fast-jpeg-encoding/comment-page-1/#comment-205</link>
		<dc:creator>Jim Dallas</dc:creator>
		<pubDate>Thu, 14 Apr 2011 09:14:45 +0000</pubDate>
		<guid isPermaLink="false">http://blog.frankvh.com/?p=31#comment-205</guid>
		<description>Hi Frank - really interesting.
We currently use the Analog JPEG libraries on a Blackfin-based camera - and are looking at power-efficient ways of cutting both a JPEG thumbnail - say 256x192 - and a &#039;web&#039; image of 1024x768 at the same time we cut the 3MP file. We want to do this on the device so we can browse the device images by Bluetooth. That is, one file would be 1/8 x 1/8 of the full size, the other 1/2 x 1/2.

Presumably one way we could do this would be to take your code, then add a downsample from the YUV in memory (two steps) and cut two new JPEGs? Another way could be to do something with the JPEG at an interim stage - but my guess is that gets tricky (e.g. is there an easyish way to transform a 2x2 set of DCTs to a single DCT at some point in the encode, then send that to be packed up into a new 1/4 size JPEG? Or is there a way at the end to take a JPEG and reduce it by 1/2 x 1/2 without decoding?). Just musing - will post back if we find anything.

Thanks
Jim</description>
		<content:encoded><![CDATA[<p>Hi Frank &#8211; really interesting.<br />
We currently use the Analog JPEG libraries on a Blackfin-based camera &#8211; and are looking at power-efficient ways of cutting both a JPEG thumbnail &#8211; say 256&#215;192 &#8211; and a &#8216;web&#8217; image of 1024&#215;768 at the same time we cut the 3MP file. We want to do this on the device so we can browse the device images by Bluetooth. That is, one file would be 1/8 x 1/8 of the full size, the other 1/2 x 1/2.</p>
<p>Presumably one way we could do this would be to take your code, then add a downsample from the YUV in memory (two steps) and cut two new JPEGs? Another way could be to do something with the JPEG at an interim stage &#8211; but my guess is that gets tricky (e.g. is there an easyish way to transform a 2&#215;2 set of DCTs to a single DCT at some point in the encode, then send that to be packed up into a new 1/4 size JPEG? Or is there a way at the end to take a JPEG and reduce it by 1/2 x 1/2 without decoding?). Just musing &#8211; will post back if we find anything.</p>
<p>Thanks<br />
Jim</p>
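<p>A minimal sketch of the in-memory downsample idea above, assuming each of Y, U and V is held as a separate 8-bit plane with even width and height. The helper name downsample_2x2 and its parameters are hypothetical and not part of the encoder; it simply averages each 2x2 block of samples.</p>

```c
/* Hypothetical helper, not part of the original encoder: halve one 8-bit
 * plane (Y, U or V) in each dimension by averaging every 2x2 block.
 * Assumes w and h are even; dst must hold (w / 2) * (h / 2) samples. */
static void downsample_2x2(const unsigned char *src, unsigned w, unsigned h,
                           unsigned char *dst)
{
    unsigned x, y;

    for (y = 0; y != h / 2; y++) {
        const unsigned char *row0 = src + (2 * y) * w;  /* upper source row */
        const unsigned char *row1 = row0 + w;           /* lower source row */

        for (x = 0; x != w / 2; x++) {
            unsigned sum = row0[2 * x] + row0[2 * x + 1]
                         + row1[2 * x] + row1[2 * x + 1];
            dst[y * (w / 2) + x] = (unsigned char)((sum + 2) / 4);  /* rounded mean */
        }
    }
}
```

<p>One pass over each plane would give the 1/2 x 1/2 image; three successive passes would give the 1/8 x 1/8 thumbnail, with each result then fed to the encoder as a separate image.</p>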
]]></content:encoded>
	</item>
	<item>
		<title>By: André</title>
		<link>http://blog.frankvh.com/2009/06/09/blackfin-fast-jpeg-encoding/comment-page-1/#comment-183</link>
		<dc:creator>André</dc:creator>
		<pubDate>Thu, 02 Dec 2010 18:53:35 +0000</pubDate>
		<guid isPermaLink="false">http://blog.frankvh.com/?p=31#comment-183</guid>
		<description>Thanks for putting this together. While adapting it to my needs I found that the Huffman coding takes a lot of time. The following code, on an i7 using gcc 4.4.1, brought a 25% speed increase for the Huffman coding. This is a code fragment: huf_ctx just holds the context, and huf_type can be either UINT32 or UINT64. Most of the gain comes from putbyte handling the probable case (a byte other than 0xFF) first.

Regards,

André

static __inline__ void putbyte(huf_ctx *hc, UINT8 data)
{
	*hc-&gt;out = data;
	hc-&gt;out++;
	if (data != 0xff)
		return;
	*hc-&gt;out = 0;
	hc-&gt;out++;
}

static __inline__ void putbits(huf_ctx *hc, int numbits, UINT32 data)
{
    int bits_in_next_word;

    bits_in_next_word = (hc-&gt;bitindex + numbits - sizeof(huf_type)*8);
    if (bits_in_next_word &lt; 0)
    {
        hc-&gt;lcode = (hc-&gt;lcode &lt;&lt; numbits) &#124; data;
        hc-&gt;bitindex += numbits;
    }
    else
    {
        hc-&gt;lcode = (hc-&gt;lcode &lt;&lt; (sizeof(huf_type)*8 - hc-&gt;bitindex)) &#124; (data &gt;&gt; bits_in_next_word);
        switch (sizeof(huf_type))
        {
        case 8:
            putbyte(hc, hc-&gt;lcode &gt;&gt; 56);
            putbyte(hc, hc-&gt;lcode &gt;&gt; 48);
            putbyte(hc, hc-&gt;lcode &gt;&gt; 40);
            putbyte(hc, hc-&gt;lcode &gt;&gt; 32);
        case 4:
            putbyte(hc, hc-&gt;lcode &gt;&gt; 24);
            putbyte(hc, hc-&gt;lcode &gt;&gt; 16);
        case 2:
            putbyte(hc, hc-&gt;lcode &gt;&gt;  8);
            putbyte(hc, hc-&gt;lcode);
        }
        hc-&gt;lcode = data;
        hc-&gt;bitindex = bits_in_next_word;
    }
}</description>
		<content:encoded><![CDATA[<p>Thanks for putting this together. While adapting it to my needs I found that the Huffman coding takes a lot of time. The following code, on an i7 using gcc 4.4.1, brought a 25% speed increase for the Huffman coding. This is a code fragment: huf_ctx just holds the context, and huf_type can be either UINT32 or UINT64. Most of the gain comes from putbyte handling the probable case (a byte other than 0xFF) first.</p>
<p>Regards,</p>
<p>André</p>
<p>static __inline__ void putbyte(huf_ctx *hc, UINT8 data)<br />
{<br />
	*hc-&gt;out = data;<br />
	hc-&gt;out++;<br />
	if (data != 0xff)<br />
		return;<br />
	*hc-&gt;out = 0;<br />
	hc-&gt;out++;<br />
}</p>
<p>static __inline__ void putbits(huf_ctx *hc, int numbits, UINT32 data)<br />
{<br />
    int bits_in_next_word;</p>
<p>    bits_in_next_word = (hc-&gt;bitindex + numbits - sizeof(huf_type)*8);<br />
    if (bits_in_next_word &lt; 0)<br />
    {<br />
        hc-&gt;lcode = (hc-&gt;lcode &lt;&lt; numbits) | data;<br />
        hc-&gt;bitindex += numbits;<br />
    }<br />
    else<br />
    {<br />
        hc-&gt;lcode = (hc-&gt;lcode &lt;&lt; (sizeof(huf_type)*8 - hc-&gt;bitindex)) | (data &gt;&gt; bits_in_next_word);<br />
        switch (sizeof(huf_type))<br />
        {<br />
        case 8:<br />
            putbyte(hc, hc-&gt;lcode &gt;&gt; 56);<br />
            putbyte(hc, hc-&gt;lcode &gt;&gt; 48);<br />
            putbyte(hc, hc-&gt;lcode &gt;&gt; 40);<br />
            putbyte(hc, hc-&gt;lcode &gt;&gt; 32);<br />
        case 4:<br />
            putbyte(hc, hc-&gt;lcode &gt;&gt; 24);<br />
            putbyte(hc, hc-&gt;lcode &gt;&gt; 16);<br />
        case 2:<br />
            putbyte(hc, hc-&gt;lcode &gt;&gt;  8);<br />
            putbyte(hc, hc-&gt;lcode);<br />
        }<br />
        hc-&gt;lcode = data;<br />
        hc-&gt;bitindex = bits_in_next_word;<br />
    }<br />
}</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: frank</title>
		<link>http://blog.frankvh.com/2009/06/09/blackfin-fast-jpeg-encoding/comment-page-1/#comment-167</link>
		<dc:creator>frank</dc:creator>
		<pubDate>Mon, 02 Aug 2010 21:50:17 +0000</pubDate>
		<guid isPermaLink="false">http://blog.frankvh.com/?p=31#comment-167</guid>
		<description>Glad to hear it&#039;s working for you Martin. That&#039;s good news. The real thanks of course must go to the folks at &lt;a href=&quot;http://www.surveyor.com&quot; rel=&quot;nofollow&quot;&gt;surveyor.com&lt;/a&gt; who created the initial code. They did a great job.</description>
		<content:encoded><![CDATA[<p>Glad to hear it&#8217;s working for you Martin. That&#8217;s good news. The real thanks of course must go to the folks at <a href="http://www.surveyor.com" rel="nofollow">surveyor.com</a> who created the initial code. They did a great job.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Martin Banky</title>
		<link>http://blog.frankvh.com/2009/06/09/blackfin-fast-jpeg-encoding/comment-page-1/#comment-166</link>
		<dc:creator>Martin Banky</dc:creator>
		<pubDate>Mon, 02 Aug 2010 21:31:54 +0000</pubDate>
		<guid isPermaLink="false">http://blog.frankvh.com/?p=31#comment-166</guid>
		<description>Frank,

I just wanted to thank you for this implementation. I&#039;m using it on an i486 800MHz SBC, with uClibc and a highly modified version of Palantir. With your code (modified for my needs and ported to C++), Palantir went from 13.25fps to 25.25fps! An absolutely incredible increase in speed!

Martin</description>
		<content:encoded><![CDATA[<p>Frank,</p>
<p>I just wanted to thank you for this implementation. I&#8217;m using it on an i486 800MHz SBC, with uClibc and a highly modified version of Palantir. With your code (modified for my needs and ported to C++), Palantir went from 13.25fps to 25.25fps! An absolutely incredible increase in speed!</p>
<p>Martin</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: frank</title>
		<link>http://blog.frankvh.com/2009/06/09/blackfin-fast-jpeg-encoding/comment-page-1/#comment-154</link>
		<dc:creator>frank</dc:creator>
		<pubDate>Fri, 25 Jun 2010 05:09:58 +0000</pubDate>
		<guid isPermaLink="false">http://blog.frankvh.com/?p=31#comment-154</guid>
		<description>First off, run the code through a profiler to see which functions are taking the most time. It might not even be your code - you might be calling some really slow libraries. Then try to optimise the code algorithmically in C as much as you can. Only when you&#039;ve done all you can that way, do you take the painful step of writing assembler.</description>
		<content:encoded><![CDATA[<p>First off, run the code through a profiler to see which functions are taking the most time. It might not even be your code &#8211; you might be calling some really slow libraries. Then try to optimise the code algorithmically in C as much as you can. Only when you&#8217;ve done all you can that way, do you take the painful step of writing assembler.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bill Strahan</title>
		<link>http://blog.frankvh.com/2009/06/09/blackfin-fast-jpeg-encoding/comment-page-1/#comment-153</link>
		<dc:creator>Bill Strahan</dc:creator>
		<pubDate>Fri, 25 Jun 2010 03:23:26 +0000</pubDate>
		<guid isPermaLink="false">http://blog.frankvh.com/?p=31#comment-153</guid>
		<description>What about a REALLY fast JPEG decoder? I&#039;ve got a CE app that we distribute on an XScale-based device, written in .NET. I&#039;ve put together a large C++ library to handle some image stuff (rotation, zooming, etc.) but the bottleneck is how quickly the images are being decoded by .NET.

They&#039;re just 256x256 tiles but they take more than 100 ms to decode on a 624 MHz processor. I&#039;m sure it could be MUCH faster, but I don&#039;t know where to begin.

Bill</description>
		<content:encoded><![CDATA[<p>What about a REALLY fast JPEG decoder? I&#8217;ve got a CE app that we distribute on an XScale-based device, written in .NET. I&#8217;ve put together a large C++ library to handle some image stuff (rotation, zooming, etc.) but the bottleneck is how quickly the images are being decoded by .NET.</p>
<p>They&#8217;re just 256x256 tiles but they take more than 100 ms to decode on a 624 MHz processor. I&#8217;m sure it could be MUCH faster, but I don&#8217;t know where to begin.</p>
<p>Bill</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Steve Howell</title>
		<link>http://blog.frankvh.com/2009/06/09/blackfin-fast-jpeg-encoding/comment-page-1/#comment-117</link>
		<dc:creator>Steve Howell</dc:creator>
		<pubDate>Mon, 04 Jan 2010 16:14:41 +0000</pubDate>
		<guid isPermaLink="false">http://blog.frankvh.com/?p=31#comment-117</guid>
		<description>Hi Frank,

Thanks for the very quick reply! I don&#039;t think it&#039;s a 64/32-bit problem. I have 32-bit ints, 16-bit shorts and 8-bit chars, as expected by the code. Thanks anyway. I&#039;ll keep investigating.

Steve</description>
		<content:encoded><![CDATA[<p>Hi Frank,</p>
<p>Thanks for the very quick reply! I don&#8217;t think it&#8217;s a 64/32-bit problem. I have 32-bit ints, 16-bit shorts and 8-bit chars, as expected by the code. Thanks anyway. I&#8217;ll keep investigating.</p>
<p>Steve</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: frank</title>
		<link>http://blog.frankvh.com/2009/06/09/blackfin-fast-jpeg-encoding/comment-page-1/#comment-116</link>
		<dc:creator>frank</dc:creator>
		<pubDate>Mon, 04 Jan 2010 16:04:32 +0000</pubDate>
		<guid isPermaLink="false">http://blog.frankvh.com/?p=31#comment-116</guid>
		<description>Hi Steve,

Hmm, I don&#039;t know. At one point I had a similar problem to what you&#039;re describing, where the JPG image started out correct but then got messed up. It was due to the assembly language DCT algorithm being used at the time. Switching to the C code DCT function fixed the problem, and I then wrote an assembly version of that C code to make it run faster. Is there any chance this might be a data type problem (e.g. you&#039;re running on a 64-bit machine when this code perhaps assumes 32 bits) or something like that?

Good luck!

Frank.</description>
		<content:encoded><![CDATA[<p>Hi Steve,</p>
<p>Hmm, I don&#8217;t know. At one point I had a similar problem to what you&#8217;re describing, where the JPG image started out correct but then got messed up. It was due to the assembly language DCT algorithm being used at the time. Switching to the C code DCT function fixed the problem, and I then wrote an assembly version of that C code to make it run faster. Is there any chance this might be a data type problem (e.g. you&#8217;re running on a 64-bit machine when this code perhaps assumes 32 bits) or something like that?</p>
<p>Good luck!</p>
<p>Frank.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Steve Howell</title>
		<link>http://blog.frankvh.com/2009/06/09/blackfin-fast-jpeg-encoding/comment-page-1/#comment-115</link>
		<dc:creator>Steve Howell</dc:creator>
		<pubDate>Mon, 04 Jan 2010 15:42:22 +0000</pubDate>
		<guid isPermaLink="false">http://blog.frankvh.com/?p=31#comment-115</guid>
		<description>Hi,

I&#039;ve been trying to build and use this jpeg code in a Microsoft Visual C++ project to convert 24 bit RGB images to JPG. I removed the __attribute__ stuff and switched it back to using all the C routines instead of the assembler ones, but I can&#039;t get it to work properly. It seems to produce a JPEG file in which the first few 8x8 blocks of pixels look correct but the rest of the image is corrupted. On quality level 8 the length of the strip of blocks which look right is longer than it is on quality level 1. I don&#039;t suppose you&#039;d have any idea what I might have got wrong?

The only changes I made to the code were:
Remove the __attribute__s.
In &quot;encodeMCU&quot;, called &quot;quantization&quot; instead of &quot;quantization_asm&quot;.
In &quot;DCT&quot;, commented out the part where it just calls &quot;jpegdct&quot; and returns.
In &quot;huffman&quot;, made it call the PUTBITS macro instead of &quot;putbits_asm&quot;.
In &quot;read_rgb24_format&quot;, I changed the &quot;#if 0&quot;s to &quot;#if 1&quot;s and vice versa, thus stopping it using &quot;read_rgb_yuv&quot;.

Any help gratefully accepted!

Thanks

Steve Howell</description>
		<content:encoded><![CDATA[<p>Hi,</p>
<p>I&#8217;ve been trying to build and use this jpeg code in a Microsoft Visual C++ project to convert 24 bit RGB images to JPG. I removed the __attribute__ stuff and switched it back to using all the C routines instead of the assembler ones, but I can&#8217;t get it to work properly. It seems to produce a JPEG file in which the first few 8&#215;8 blocks of pixels look correct but the rest of the image is corrupted. On quality level 8 the length of the strip of blocks which look right is longer than it is on quality level 1. I don&#8217;t suppose you&#8217;d have any idea what I might have got wrong?</p>
<p>The only changes I made to the code were:<br />
Remove the __attribute__s.<br />
In &#8220;encodeMCU&#8221;, called &#8220;quantization&#8221; instead of &#8220;quantization_asm&#8221;.<br />
In &#8220;DCT&#8221;, commented out the part where it just calls &#8220;jpegdct&#8221; and returns.<br />
In &#8220;huffman&#8221;, made it call the PUTBITS macro instead of &#8220;putbits_asm&#8221;.<br />
In &#8220;read_rgb24_format&#8221;, I changed the &#8220;#if 0&#8221;s to &#8220;#if 1&#8221;s and vice versa, thus stopping it using &#8220;read_rgb_yuv&#8221;.</p>
<p>Any help gratefully accepted!</p>
<p>Thanks</p>
<p>Steve Howell</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Franklin</title>
		<link>http://blog.frankvh.com/2009/06/09/blackfin-fast-jpeg-encoding/comment-page-1/#comment-109</link>
		<dc:creator>Franklin</dc:creator>
		<pubDate>Fri, 30 Oct 2009 19:08:44 +0000</pubDate>
		<guid isPermaLink="false">http://blog.frankvh.com/?p=31#comment-109</guid>
		<description>Frank,
You are a genius! The following change produces the expected picture:

INT16 *CB_Ptr = CR;//CB;
INT16 *CR_Ptr = CB;//CR;

Next step, the inverted image.

Thanks a lot
Franklin</description>
		<content:encoded><![CDATA[<p>Frank,<br />
You are a genius! The following change produces the expected picture:</p>
<p>INT16 *CB_Ptr = CR;//CB;<br />
INT16 *CR_Ptr = CB;//CR;</p>
<p>Next step, the inverted image.</p>
<p>Thanks a lot<br />
Franklin</p>
]]></content:encoded>
	</item>
</channel>
</rss>

