Moderated by: Fred Miranda

Archive 2013 · Lightroom 5 Performance Testing: Pt.1 - Library Module

  
 
15Bit
Offline
• • • • •
Upload & Sell: Off
p.1 #1 · Lightroom 5 Performance Testing: Pt.1 - Library Module


Some of you will hopefully remember that from time to time I try to do semi-thorough performance evaluations of Lightroom. Just to refresh your memories, the old ones are here:

https://www.fredmiranda.com/forum/topic/1165920
https://www.fredmiranda.com/forum/topic/1123725
https://www.fredmiranda.com/forum/topic/1005214

The goal of these tests is to see where LR runs fast and where it doesn’t, and to provide some fact-based insight into the software which we can all use to help us to speed up our post processing and minimise the stresses that come from sitting in front of a computer that doesn’t respond as fast as we’d like.

The system used for testing is:

Intel i5-3570K clocked at various speeds from 1.6Ghz to 4.3Ghz, with turbo boost off.
Asus P8Z77-V-Pro motherboard
16Gb of DDR3-1600 RAM
128Gb Samsung 830 SSD - Boot drive (mounted as C-Drive)
80Gb Intel X25 G2 SSD – Images, LR catalogue, ACR temp dir (mounted as G-Drive)
Assorted other hard disks which don’t come into play here.

I am comparing LR4.4 and LR5.0

System monitoring is done using the nifty utils “process explorer” and “process monitor”, formerly from sysinternals, but now from Microsoft. Timing is done by hand using the stopwatch on my phone.

For general testing, two catalogues were made, totally independent from one another: Though the same images are used for both, 2 different directories are used to house them. All catalogues and images were located on the Intel X-25 SSD (G-Drive). The image directory contained the same 200 images, 100 from a Canon 5D, 100 from a Fuji X10. Apologies to those of you with high MP cameras, but I just don’t have the money (or justification) to upgrade my trusty 5D to something more modern.

To really provide a specific and tough test, and thus hopefully slow things down enough to measure any differences between the software versions, additional separate catalogues comprising a single large image were also used. The image was a panorama made up of several images from a 5D stitched together and saved as a TIF. It has a lot of trees, leaves etc. and so a lot of fine detail to render. The image is 13147 x 4532 pixels and takes up 454Mb on disk.

As these tests tend to get a bit long and complicated, I’ve decided to split them up this time. Today then, is testing of only the Library module. When I get time (hopefully tomorrow) I will do a “Pt. 2” testing out the tools in the Develop module.

Test 1 – Library module: Importing

This test takes a completely fresh catalogue, and adds a folder of images on disk. This occurs via the “import” module with concurrent rendering of 1:1 previews for the 200 images. In order to speed things up, the test was run at the full 4.3Ghz clockspeed.

Results:

LR5 – import time 4 mins 5 secs
LR4.4 – Import time 4 mins 10 secs

I would comment that whilst getting the sysinternals logging software running correctly I repeated these tests a couple of times and there is a little spread in times from run to run of around 5-10 secs, so within that error I am uncomfortable concluding that LR5 is definitely faster than LR4.4, though it does appear slightly so.

Looking at the CPU usage (below), LR5 is clearly utilising all 4 cores, but usage never gets near 100%. The results are pretty much exactly the same as I found in last year’s testing on LR4.3, so I’m not going to comment more on that.

http://farm4.staticflickr.com/3823/9047788283_efd32b3888_c.jpg

Test 2 - Performance Scaling of Importing

Because the numbers above are so similar I chose not to repeat the in-depth core scaling that I did last time, and just do a couple of quick tests with LR5: Compare 2 cores with 4 cores, and compare 1.6Ghz with 3.2Ghz and 4.3Ghz.

Results:

LR5 – Import time with 2 cores at 4.3Ghz = 5min 46sec
LR5 – Import time with 4 cores at 1.6Ghz = 9min 51sec
LR5 – Import time with 4 cores at 3.2Ghz = 5min 17sec

We can see a few things from these numbers.

- Firstly, importing into LR5 does not scale well with core count. I found the same sort of scaling for LR4.3 last year, and nothing seems to have changed in LR5: Doubling from 2 to 4 cores gives only a 1.41x speed up. I am surprised at how poor that is – I would expect much better scaling for a task where the software has the option to spin out one image per core.

- Secondly, importing into LR5 does scale well with Ghz. Doubling from 1.6Ghz to 3.2Ghz gives a 1.86x speed up. Not quite linear, but very good nevertheless.

- Thirdly, a heavily overclocked dual core is not very much slower than a quad core at standard speed.
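
The ratios behind these three points can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the helper name `to_seconds` and the `(cores, GHz)` dictionary keys are made up for illustration, and the times are the hand-timed figures from Tests 1 and 2.

```python
import re


def to_seconds(text):
    """Convert a stopwatch string such as '5min 46sec' or '4 mins 5 secs' to seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*mins?\s*(\d+)\s*secs?\s*", text)
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds


# LR5 import times for the 200 images, keyed by (cores, clock in GHz).
import_times = {
    (4, 4.3): to_seconds("4 mins 5 secs"),   # Test 1
    (2, 4.3): to_seconds("5min 46sec"),      # Test 2
    (4, 1.6): to_seconds("9min 51sec"),      # Test 2
    (4, 3.2): to_seconds("5min 17sec"),      # Test 2
}


def speedup(slow, fast):
    """How many times faster the `fast` configuration is than the `slow` one."""
    return import_times[slow] / import_times[fast]


core_scaling = speedup((2, 4.3), (4, 4.3))    # doubling the core count
clock_scaling = speedup((4, 1.6), (4, 3.2))   # doubling the clock speed
dual_vs_quad = speedup((2, 4.3), (4, 3.2))    # overclocked dual vs 3.2GHz quad

print(f"2 -> 4 cores at 4.3GHz:       {core_scaling:.2f}x")
print(f"1.6 -> 3.2GHz on 4 cores:     {clock_scaling:.2f}x")
print(f"4 cores@3.2 vs 2 cores@4.3:   {dual_vs_quad:.2f}x")
```

Bear in mind the 5-10 second run-to-run spread mentioned in Test 1: the second decimal place of these ratios is not meaningful.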

It is of course interesting to also look at the file I/O that goes on during import, particularly to gauge where to use your SSD if you have one. This is what I see:

http://farm8.staticflickr.com/7355/9047788395_81c99dbe4b_c.jpg

I’ve closed down the file tree to keep things compact in the image, so I should add that the I/O to C-drive is all to c:\users\username\AppData\Local\Temp where LR appears to keep some temp files.

On G-Drive most activity is reading the images and writing out ACR cache and previews. Around 3.2Gb of data is read (for reference, there are 2.8Gb of images) and 725Mb is written. In total we have around 725Mb of data going into the catalogue directory. Most of this is the previews (714Mb), with the remaining 11Mb or so going to the catalogue database in some way, as you would expect.

I’m not a programmer, so I confess I don’t really understand why there have to be so many file events accompanying the data reading and writing. Is it really necessary to have 18k file opens and 34k reads to process 200 images? And how does that affect performance?
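
To put those event counts in per-image terms, here is a small Python sketch. It uses only the rounded figures quoted above (18k opens, 34k reads, 3.2Gb read, 2.8Gb of images, 714Mb of previews), so the results are rough; the variable names are illustrative only.

```python
# Rounded figures quoted in the post for the LR5 import of 200 images.
IMAGES = 200
FILE_OPENS = 18_000
FILE_READS = 34_000
DATA_READ_GB = 3.2       # total data read from G-Drive
IMAGE_DATA_GB = 2.8      # size of the 200 source images
PREVIEW_WRITE_MB = 714   # data written to the previews directory

opens_per_image = FILE_OPENS / IMAGES
reads_per_image = FILE_READS / IMAGES
# Ratio of data read to the size of the images themselves.
read_overhead = DATA_READ_GB / IMAGE_DATA_GB
preview_mb_per_image = PREVIEW_WRITE_MB / IMAGES

print(f"File opens per image:  {opens_per_image:.0f}")
print(f"File reads per image:  {reads_per_image:.0f}")
print(f"Data read vs image size: {read_overhead:.2f}x")
print(f"Preview data per image: {preview_mb_per_image:.2f} Mb")
```

The counts alone do not say how costly each event is; that depends on the drive and on how much of the access is served from the operating system's file cache, which this sketch does not model.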

I should note that I also recorded the file I/O for LR4, but it was to all intents and purposes identical to LR5.

Test 3 – Browsing in the Library module

As for my LR4.3 testing, I cleared out the caches and then clicked and zoomed on images in the Library module, forcing them to render 1:1. This test was done with the CPU set to 4 cores at 1.6Ghz, just to emphasize any differences. However, given the results above I expected to see the same results for LR5 and LR4. I was not disappointed in this prediction.

LR5 CPU usage:

http://farm8.staticflickr.com/7284/9050793938_65fb6f5702_b.jpg

LR4.4 CPU usage

http://farm4.staticflickr.com/3789/9048657463_a10c1cc02a_c.jpg

Again, LR is using all 4 cores, but not getting close to 100% usage.

I would add that subjectively I didn’t “feel” any difference between the two versions.

Again it is interesting to see the accompanying file access stats. They tell exactly the same story as for the import and rendering of 1:1 previews test.

LR5 File access

http://farm8.staticflickr.com/7406/9048563357_c8715eab94_c.jpg

Test 4 – So is the LR5 develop module faster to use than LR4?

From the CPU usage plots above, LR5 is clearly pretty similar to LR4 in the way it is rendering the previews. It is suggested by the import times that LR5 *might* be slightly faster, though the result is hardly conclusive and LR5 didn’t feel more snappy when I was browsing the catalogue of 200 images. In order to really slow things down then, this test is to see how LR will render my big pano image when I zoom into it. The CPU is set at 4 cores running at 1.6Ghz, and the test is repeated 5 times and the average time taken for the image to go “sharp” is given.

Results:

LR5 – Average 14.4 sec
LR4.4 – Average 15.3 sec

And the CPU load:

http://farm3.staticflickr.com/2892/9052176270_bd6fd616db.jpg

Well I didn’t believe this when I measured it, so I repeated it and took care with the timing. There is a spread of 0.4 secs in the timings (which isn’t bad for a manual stopwatch), so you could argue that the averages are within errors of each other, but I think the result is real – LR5 does render the pano about a second faster. Both LR4 and LR5 give similar CPU load profiles, and both use all 4 cores at 100%, but LR5 is just a touch faster.
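
As a quick sanity check on that claim, the sketch below compares the gap between the two averages with the quoted 0.4 sec spread. It is not a formal significance test (the five individual timings are not listed, so none is possible here), just the arithmetic; the variable names are illustrative.

```python
# Averages of 5 hand-timed runs, 4 cores at 1.6GHz.
lr5_avg = 14.4    # seconds
lr44_avg = 15.3   # seconds
spread = 0.4      # quoted spread in the manual timings, seconds

difference = lr44_avg - lr5_avg
# Reduction in render time, relative to LR4.4.
time_saved = difference / lr44_avg
# Equivalent gain in rendering speed (renders per unit time).
throughput_gain = lr44_avg / lr5_avg - 1.0
exceeds_spread = difference > spread

print(f"LR5 is {difference:.1f} sec faster ({time_saved:.1%} less time, "
      f"{throughput_gain:.1%} more throughput)")
print(f"Difference larger than the timing spread: {exceeds_spread}")
```

The gap is a bit more than twice the quoted spread, which is consistent with reading the result as real but small.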

And just for the hell of it, I also checked how this test scales:

LR5, 4 cores at 3.2Ghz – Average 7.7 sec
LR5, 2 cores at 4.3Ghz – Average 10.5 sec
LR5, 4 cores at 4.3Ghz – Average 7.0 sec

So going from 2 to 4 cores (at 4.3Ghz) gives a 1.5x speed up. Going from 1.6Ghz to 3.2Ghz (4 cores) gives a 1.9x speed up. By and large, these are the same as we saw for importing and rendering the 200 images in test 2, though there is a more pronounced difference between the timing for 2 cores at 4.3Ghz vs 4 cores at 3.2Ghz (10.5 sec vs 7.7 sec here, compared with 5min 46sec vs 5min 17sec for the import).


Conclusion

Given the similar import times and very similar file I/O I would conclude that Adobe have re-used the same rendering code in LR5 as LR4 (big surprise). There is an indication that they might have tweaked it a touch, but there is no huge speed-up that the user will notice.

Performance scaling is quite interesting. Firstly, performance scales much better with Mhz than with core count. We saw this with LR4.3 last year too, and I am going to draw the same conclusion as I did then and say that the performance sweet spot is an overclocked quad core i5-3570K or i7-3770K. A hex core i7-3930K might well give you a bit more oomph, but it’s not going to be at a good performance per $.

There is perhaps an insight into the rendering engine in the scaling results too. Notice that when going from 2 to 4 cores there is a 1.4x speed-up for rendering 200 images and a 1.5x speed up for rendering the big pano? Obviously the engine is fully threaded as we see the CPU use flatline at 100% for the pano render, but the numbers sort of suggest that the rendering engine throws all available cores at each image irrespective of the type of workload: For the relatively small 5D and X10 images I have, the CPU never gets to 100% when rendering and so scaling is worse than for the pano render. This also explains the results for 2 cores at 4.3Ghz compared with 4 cores at 3.2Ghz. If LR were to more intelligently assess the workload and divide up rendering according to image size (i.e. assign 1 image per core or per 2 cores for smaller image sizes) I wonder if the scaling wouldn’t improve quite a lot for higher core-count PCs.
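
One hedged way to put a number on this is Amdahl's law, which is not used anywhere in the testing itself and is brought in here purely as an illustration. It assumes run time on n cores behaves like T(n) = (1 - p) + p/n, where p is the fraction of the work that parallelises perfectly. Real renders are also affected by memory bandwidth, cache and disk I/O, so p below is best read as "the fraction that behaves as if it were parallel". The function names are made up for the sketch.

```python
def parallel_fraction(speedup_2_to_4):
    """Amdahl parallel fraction p implied by a measured 2 -> 4 core speed-up.

    Model (an assumption, not something measured here): T(n) = (1 - p) + p / n.
    Solving T(2) / T(4) = R for p gives p = (R - 1) / (0.75 * R - 0.5).
    """
    r = speedup_2_to_4
    if not 1.0 <= r <= 2.0:
        raise ValueError("a 2 -> 4 core speed-up must lie between 1x and 2x")
    return (r - 1.0) / (0.75 * r - 0.5)


def amdahl_speedup(p, cores_from, cores_to):
    """Speed-up predicted by the same model when changing the core count."""
    def run_time(cores):
        return (1.0 - p) + p / cores
    return run_time(cores_from) / run_time(cores_to)


# Measured 2 -> 4 core speed-ups at 4.3GHz, from Tests 1, 2 and 4.
import_p = parallel_fraction(346 / 245)   # 5min 46sec vs 4min 5sec
pano_p = parallel_fraction(10.5 / 7.0)    # pano render averages

for label, p in (("200-image import", import_p), ("pano render", pano_p)):
    projected = amdahl_speedup(p, 4, 6)
    print(f"{label}: parallel fraction ~{p:.2f}, "
          f"projected 4 -> 6 cores ~{projected:.2f}x")
```

The 4 to 6 core figure is a projection from a two-point fit at the same clock speed, not a measurement. Under those assumptions it is in line with the "bit more oomph" expected from a hex core above.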

With respect to hard disks, the headline result is pretty obvious – there is a lot of reading from the image directory and a lot of writing to the ACR cache and preview directories. C-drive is also touched, but not so much. The sticky question is what impact all those file opens and closes have on performance and how much that favours SSDs over spinning disks.




Jun 15, 2013 at 12:43 PM
WAYCOOL
Offline
• • • •
Upload & Sell: Off
p.1 #2 · Lightroom 5 Performance Testing: Pt.1 - Library Module


With such fine testing we can now ignore all the "LR5 is way faster" and "way slower than LR4" threads. Nice to see a small speed bump though.


Jun 15, 2013 at 12:58 PM
rick2906
Offline
• • • •
Upload & Sell: Off
p.1 #3 · Lightroom 5 Performance Testing: Pt.1 - Library Module


Wow, thanks for the very detailed testing!! I was wondering if LR5 was faster!! It is, but there is not much improvement.


Jun 15, 2013 at 01:01 PM
OntheRez
Offline
• • • • •
Upload & Sell: On
p.1 #4 · Lightroom 5 Performance Testing: Pt.1 - Library Module


First, thanks for taking the time to do the testing. Very interesting and useful. I think it's reasonable to say that within the margin of error of manual timing there is little or no difference in the I/O performance between the versions. As for rendering you seem to have recorded an improvement of ~6-7%. Not exactly stunning. It also comes as no surprise that faster and more cores make a difference, though it is discouraging that simply increasing CPU speed gets a proportionally greater gain than increasing processors. The only conclusion I can draw is that Adobe's programmers still aren't predictively queuing and using all the pipes available to them. In other words the code isn't optimized for multiple processors. Given how long they have been available, that is sad. So I guess the justification for upgrading needs to come from new/better/more efficient tools. Haven't heard much said about that thus far.

Thanks again 15Bit. That took a spot of work!

Robert



Jun 15, 2013 at 05:12 PM
mshi
Offline
• • • •
[X]
p.1 #5 · Lightroom 5 Performance Testing: Pt.1 - Library Module


the root cause of LR's performance issue is entirely tied to the open-source file-based database engine technology that Adobe opted to use to save money. google that, and you will know the sooner you ditch that, the better.


Jun 15, 2013 at 06:47 PM
aubsxc
Offline
• • •
Upload & Sell: Off
p.1 #6 · Lightroom 5 Performance Testing: Pt.1 - Library Module


15Bit, excellent work as usual. Thank you for taking the time to run these tests and record and interpret the results. It is nice to see how well the processing times scale with clockspeed, and I think it is safe to say that the sweet spot for LR users is an inexpensive 3570K or 2500K overclocked to 4.2 to 4.6 GHz.

Do you have access to i7 processors with HT? It would be interesting to compare the non ht i5 with the ht i7 cpus from the same generation to see how effective Intel's ht is in this application.



Jun 15, 2013 at 07:10 PM
John Caldwell
Offline
• • • •
Upload & Sell: On
p.1 #7 · Lightroom 5 Performance Testing: Pt.1 - Library Module


Thank you truly for this work and reportage. While much of this was over my head, your conclusions were completely understandable to me.

The impact of cataloging to a SSD drive remains of interest to me, and I'll be following what you and others say on that subject as we move forward with LR5.

Thanks again.

John Caldwell



Jun 15, 2013 at 07:44 PM
Allynb
Offline
• • •
Upload & Sell: Off
p.1 #8 · Lightroom 5 Performance Testing: Pt.1 - Library Module


Thanks for sharing your findings.


Jun 15, 2013 at 10:14 PM
15Bit
Offline
• • • • •
Upload & Sell: Off
p.1 #9 · Lightroom 5 Performance Testing: Pt.1 - Library Module


Thanks guys. Stick around cos there is more coming. Pt 2 should deal with the develop module, and there will be a short Pt 3 looking at the impact of SSDs.


Thanks again 15Bit. That took a spot of work!


Yes it did. More than i expected in fact.


the root cause of LR's performance issue is entirely tied to the open-source file-based database engine technology that Adobe opted to use to save money


I don't think that is the case here - these tests are completely dominated by the rendering time, which is all about I/O and CPU grunt, and how well the multiple cores are utilised. There is some database access of course, but it is actually quite low - just a few thousand events and a few Mb of data.

I expect the Develop module testing will throw up more database access events. It won't impact performance for my testing as i have a very small test catalogue (and even my main catalogue is only 20k images), but seeing heavy access might well be a good pointer.


Do you have access to i7 processors with HT?


Sadly not. I did the research when buying and decided an i7 wasn't worth the extra money for what i do. Perhaps i should have spent the extra when i bought originally, but i definitely can't see the value in replacing my existing chip now.


The impact of cataloging to a SSD drive remains of interest to me


I'm part way through that testing now. The results are a little surprising so far...



Jun 16, 2013 at 01:24 AM
James_N
Offline
• • •
Upload & Sell: Off
p.1 #10 · Lightroom 5 Performance Testing: Pt.1 - Library Module


Thanks for doing this test; it is quite informative. I didn't do any formal testing but I installed LR 5 this morning and created a new catalog consisting of 17K images. I was surprised to see 6 operations running concurrently (one import and five building standard previews at first, then all six processes building previews). Windows Task Manager showed all cores consistently running at approx. 60 percent CPU usage.

http://i.imgur.com/eapQDFs.jpg

Perhaps I'm mistaken but I seem to recall that in previous versions LR didn't allow that many concurrent processes and this test suggests that more than 2 concurrent imports was counter-productive: Optimizing Adobe Lightroom



Jun 16, 2013 at 11:12 AM
Squirrely Eyed
Offline
• • •
Upload & Sell: Off
p.1 #11 · Lightroom 5 Performance Testing: Pt.1 - Library Module


15Bit wrote:
Test 2 - Performance Scaling of Importing

Results:

LR5 – Import time with 2 cores at 4.3Ghz = 5min 46sec
LR5 – Import time with 4 cores at 1.6Ghz = 9min 51sec
LR5 – Import time with 4 cores at 3.2Ghz = 5min 17sec

We can see a few things from these numbers.

- Firstly, importing into LR5 does not scale well with core count. I found the same sort of scaling for LR4.3 last year, and nothing seems to have changed in LR5: Doubling from 2 to 4 cores gives only a 1.41x speed up. I am surprised at how poor that is –

Most likely the lack of scaling by core count is due to the limited 6 MB of L3 cache that is shared among all the cores. This is a common problem in computer performance.

Workloads that scale 1-1 with core count are L3-contained number crunchers.

I would be curious to see how your performance compares at the high-frequency range. While it may scale linearly from 1.6 to 3.2 GHz, does it continue to scale linearly in the 4+ GHz range? I'm not sure what is Intel's DMI bandwidth, but once you saturate it then CPU speed won't scale as well for this type of workload.



Jul 01, 2013 at 09:53 AM
15Bit
Offline
• • • • •
Upload & Sell: Off
p.1 #12 · Lightroom 5 Performance Testing: Pt.1 - Library Module


Squirrely Eyed wrote:
Most likely the lack of scaling by core count is due to the limited 6 MB of L3 cache that is shared among all the cores. This is a common problem in computer performance.

It's less of a problem than it used to be, as even a 6Mb cache is actually pretty big. Also, the pre-fetching and caching algorithms on modern CPUs are really excellent. Still, it would be interesting to see the difference a larger cache makes. Short of spending a lot of money i have no way to do it though...

Workloads that scale 1-1 with core count are L3-contained number crunchers.

Not really. For 1:1 scaling you need jobs that have little or no cross communication between threads running on different cores - fitting in the cache helps for sure, but cross-thread communication and operations on the global dataset are what kill performance.

I would be curious to see how your performance compares at the high-frequency range. While it may scale linearly from 1.6 to 3.2 GHz, does it continue to scale linearly in the 4+ GHz range? I'm not sure what is Intel's DMI bandwidth, but once you saturate it then CPU speed won't scale as well for this type of workload.


That would be an interesting test, i agree. Looking at my testing threads i notice i haven't actually done this test - i did do scaling with cores for LR4, but not with Mhz. Maybe when i get home from vacation i will give it a shot.



Jul 01, 2013 at 01:53 PM
Squirrely Eyed
Offline
• • •
Upload & Sell: Off
p.1 #13 · Lightroom 5 Performance Testing: Pt.1 - Library Module


15Bit wrote:
Not really. For 1:1 scaling you need jobs that have little or no cross communication between threads running on different cores - fitting in the cache helps for sure, but cross-thread communication and operations on the global dataset are what kill performance.


I should amend my statement as it is patently false. What I meant to say is that L3-contained number crunchers are prime examples of workloads that scale 1-1.



Jul 01, 2013 at 02:21 PM




