Not all that deep since I'm far from an expert on this niche of sensor theory but perhaps still useful to those who are curious.
Theory of Operation
Most image sensors are color blind and achieve color sampling by placing colored filters in front of each pixel. The filters are typically arranged in a repeating 2x2 RGGB pattern, with green pixels representing 50% of the total pixels and red and blue 25% each. This means each pixel in a raw file has only green, red, or blue data - not all three. The missing color values at each pixel are interpolated from neighboring pixels during raw processing (demosaicing) to produce a full-color image, with some inherent downsides including lower chroma sampling and demosaicing artifacts.
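As a rough illustration of the layout described above, here is a minimal Python sketch of an RGGB mosaic. The function names and the choice of red at the even-row/even-column corner are my own assumptions for illustration; a real sensor may start the pattern at a different corner.

```python
from collections import Counter

def cfa_color(row, col):
    """Return the filter color over sensor pixel (row, col) for an RGGB tile."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"

def color_fractions(height, width):
    """Fraction of pixels behind each filter color on a height x width sensor."""
    counts = Counter(cfa_color(r, c) for r in range(height) for c in range(width))
    total = height * width
    return {color: counts[color] / total for color in "RGB"}

# Each pixel records exactly one color; half of them are green.
print(color_fractions(4, 6))
```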
Pixel Shift uses multiple exposures and the camera's IBIS mechanism to shift the position of the sensor in increments of 1 pixel for each exposure so that every "scene pixel" will be sampled by all four of the Bayer color filters in the 2x2 RGGB pattern. This yields a full RGB triplet for every "scene pixel" vs the single monochromatic value obtained in a single exposure. This produces three benefits: it increases the chroma (color) resolution of the image, reduces the noise of each pixel (from mean/median averaging of the multiple exposures), and eliminates demosaicing artifacts because the resulting image contains linear RGB that doesn't need to be demosaiced by the raw processor.
1. Take first exposure at baseline sensor position
2. Shift the sensor one pixel right and take second exposure
3. Shift the sensor one pixel down and take third exposure
4. Shift the sensor one pixel left and take fourth exposure
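The four steps above can be sketched in Python to show why every scene pixel ends up with a full RGB triplet. This is only a noise-free toy model under my own assumptions: the names `cfa_color`, `capture` and `combine` are hypothetical, the color filter pattern is treated as infinitely periodic (so frame edges are ignored), and the plain mean of the two green samples is not necessarily what NX Studio does.

```python
def cfa_color(row, col):
    """Filter color over sensor pixel (row, col) in an RGGB layout."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"

# Sensor position for each exposure as (rows down, columns right):
# baseline, one pixel right, one pixel down, one pixel left.
SENSOR_OFFSETS = [(0, 0), (0, 1), (1, 1), (1, 0)]

def capture(scene, offset):
    """Sample each scene pixel through whichever filter currently covers it."""
    d_row, d_col = offset
    frame = {}
    for (r, c), rgb in scene.items():
        color = cfa_color(r - d_row, c - d_col)  # sensor pixel under this scene pixel
        frame[(r, c)] = (color, rgb["RGB".index(color)])
    return frame

def combine(frames):
    """Gather the samples per scene pixel: one R, two G (averaged), one B."""
    out = {}
    for pos in frames[0]:
        samples = {"R": [], "G": [], "B": []}
        for frame in frames:
            color, value = frame[pos]
            samples[color].append(value)
        out[pos] = tuple(sum(samples[ch]) / len(samples[ch]) for ch in "RGB")
    return out

# Toy 4x4 scene with a distinct RGB triplet at every position.
scene = {(r, c): (10 * r + c, 100 + r, 200 + c) for r in range(4) for c in range(4)}
frames = [capture(scene, offset) for offset in SENSOR_OFFSETS]
result = combine(frames)
print(result[(0, 0)], result[(3, 2)])
```

In this idealized model the combined result matches the scene exactly, with no interpolation between neighboring pixels.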
The result is 4 raw files, which must later be assembled into a single composite raw using Nikon's [url=https://downloadcenter.nikonimglib.com/en/products/564/NX_Studio.html]NX Studio software[/url], which looks like this:
Increased Resolution
Another technique that can be optionally employed is to shift the sensor by a fractional amount, specifically 1/2 a pixel. This can simulate a sensor with double the linear pixel density in both the horizontal and vertical axis (shifting both up/down), resulting in an image with 4x the total number of pixels. There are lots of practical factors which limit the increase in effective resolution though.
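To make the 4x pixel count concrete, here is a minimal Python sketch that interleaves four frames, each captured with the sampling grid offset by half a pixel, into a grid with twice the linear density. The function name and the frame keys are hypothetical, and this is only the simplest possible arrangement, not a description of NX Studio's merge. Note that each sample still comes from a full-size pixel, so neighboring output values overlap in what they see, which is one reason the effective resolution gain is smaller than the pixel count suggests.

```python
def interleave_half_pixel(frames):
    """Interleave four half-pixel-offset frames into a 2x denser grid.

    frames[(half_row, half_col)] is a 2D list captured with the sampling
    grid offset by that many half-pixels (0 or 1) down and to the right.
    """
    height = len(frames[(0, 0)])
    width = len(frames[(0, 0)][0])
    out = [[None] * (2 * width) for _ in range(2 * height)]
    for (half_row, half_col), frame in frames.items():
        for r in range(height):
            for c in range(width):
                out[2 * r + half_row][2 * c + half_col] = frame[r][c]
    return out

# Four tiny 1x2 frames become one 2x4 image (4x the pixel count).
frames = {
    (0, 0): [[1, 2]],
    (0, 1): [[3, 4]],
    (1, 0): [[5, 6]],
    (1, 1): [[7, 8]],
}
for row in interleave_half_pixel(frames):
    print(row)
```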
Here's a diagram showing how the Zf moves the sensor for a 16-shot pixel shift sequence. Note that it still performs the chroma-enhancing 4-shift operation on top of the 1/2 pixel shift: https://photos.smugmug.com/photos/i-qz4R2JF/0/b17491ce/O/i-qz4R2JF.png
I generated the above diagram by analyzing the 16 raw files created by a 16-shot shift sequence and noting which direction the image shifted for each exposure and by how much, i.e., 1 pixel vs 1/2 pixel. Here's a visual animation showing a sample of the image data I used.
Note that the direction of sensor movement is opposite that of the image movement. For example, shifting the sensor right by one pixel actually shifts the captured image left by one pixel. There is a significant chance I messed the process up, but even so hopefully you'll get the gist.
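For anyone wanting to automate that kind of comparison, here is one way it could be sketched in Python. This is not the method used for the diagram above (that was done by visual inspection). It is a brute-force search over whole-pixel shifts on a synthetic monochrome frame, all names are hypothetical, and it does not handle half-pixel shifts or the Bayer filter differences between real raw frames.

```python
import random

def shift_image(img, d_row, d_col, fill=0):
    """Move image content by (d_row, d_col) pixels; positive is down/right."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            src_r, src_c = r - d_row, c - d_col
            if 0 <= src_r < h and 0 <= src_c < w:
                out[r][c] = img[src_r][src_c]
    return out

def estimate_image_shift(ref, moved, max_shift=2):
    """Return the whole-pixel (d_row, d_col) that best maps ref onto moved."""
    h, w = len(ref), len(ref[0])
    best = None
    for d_row in range(-max_shift, max_shift + 1):
        for d_col in range(-max_shift, max_shift + 1):
            err = 0
            # Compare only the interior so every lookup stays inside the frame.
            for r in range(max_shift, h - max_shift):
                for c in range(max_shift, w - max_shift):
                    diff = moved[r][c] - ref[r - d_row][c - d_col]
                    err += diff * diff
            if best is None or err < best[0]:
                best = (err, d_row, d_col)
    return best[1], best[2]

rng = random.Random(0)
ref = [[rng.randrange(256) for _ in range(12)] for _ in range(12)]
moved = shift_image(ref, 0, -1)  # image content moved one pixel left

image_shift = estimate_image_shift(ref, moved)
sensor_shift = (-image_shift[0], -image_shift[1])  # sensor moves the opposite way
print("image shift:", image_shift, "sensor shift:", sensor_shift)
```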
Pixel Shift options and their effect on resolution and noise
The Zf offers four possible shot counts for a pixel shift:
4-shot: Same resolution / pixel count as single exposure but with full chroma sampling and lower noise from mean/median averaging of the exposures
8-shot: Same resolution / pixel count as 4-shot mode but even lower noise from additional averaging of extra exposures.
16-shot: 4x the pixel count of a single exposure and with full chroma sampling. I'm being careful to not say 4x the resolution since the effective increase in resolution will be limited by factors that will be considered in a future post.
32-shot: Same pixel count / chroma as the 16-shot but even lower noise from additional averaging of extra exposures.
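The four modes can be tabulated with a bit of Python arithmetic. This sketch assumes a 24MP base frame and an idealized model in which noise is independent between exposures and the frames are combined with a plain mean, so noise falls with the square root of the number of samples averaged. The function names are hypothetical, and NX Studio's actual blending may differ.

```python
import math

# Shot count -> output pixel-count multiplier, per the list above.
PIXEL_MULTIPLIER = {4: 1, 8: 1, 16: 4, 32: 4}

def summarize(shots, base_mp=24):
    """Output size and raw samples landing on each output pixel."""
    multiplier = PIXEL_MULTIPLIER[shots]
    return {
        "output_mp": base_mp * multiplier,
        "samples_per_output_pixel": shots // multiplier,
    }

def relative_noise(samples_a, samples_b):
    """Ideal noise of averaging samples_a values relative to samples_b values."""
    return math.sqrt(samples_b / samples_a)

for shots in (4, 8, 16, 32):
    print(shots, summarize(shots))

# Doubling the samples per pixel (8-shot vs 4-shot) ideally cuts noise to ~71%.
print(round(relative_noise(8, 4), 3))
```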
Interesting note about the 8-shot and 32-shot: the sensor actually shifts into new sensor rows for the extra exposures rather than shifting back to the original 2 rows of the 4-shot and 16-shot methods. I may touch on this in a later post.
There are additional workflow options as well, including delay to start of the sequence and the inter-exposure delay. Here is a short menu walkthrough demonstrating the options:
Sample Results
I'll now present some samples showing the benefits of pixel shifting in action. First we'll look at the reduction of demosaicing artifacts (aliasing and moiré) as well as the increase in resolved detail:
The above is an enlarged crop from each of the respective images, all upscaled to the resolution of the 16-shot exposure via PS Preserve Details 2.0. Note I show both capture-sharpening only samples (i.e., ACR raw sharpening, set to 45/0.7/35), as well as capture sharpening + USM sharpening in PS (using 300/0.5). USM is used to demonstrate how the 16-shot pixel shift can tolerate (and actually requires) additional sharpening vs the single exposure and 4-shot shift. I hope to go more into this in a later post.
I chose to use the same USM values for all samples, even though it creates obvious over-sharpening artifacts for all samples other than the 16-shot shift. I did this because using right-sized sharpening for each sample runs the risk of becoming subjective. Still, I plan to do some more nuanced sharpening comparisons in a future post.
When comparing the images, try to find areas of text in the chart that are unreadable in the single exposure but readable in the pixel-shifted images. And of course compare that to the native higher resolution of the Z7 sample. Also note the reduction/elimination of aliasing/false detail.
Here are animations of the same crops to help in visualizing the differences. Unfortunately the Z7 sample is slightly rotated - I chose not to correct that in post for this comparison since doing so would affect its acuity.
There's a significant decrease in visible noise between the single exposure and 4-shot in the above animation. There's actually a similar noise reduction between the 4-shot and 16-shot but it's less perceptible.
I hope to shoot some additional test scenes, preferably some with actual color to better highlight the increase in chroma resolution.
Thank you for generously sharing such extensive work. The reduction in noise and the clean pixels look great. How that translates to actual images is what I am most interested in. I have seen some impressive renditions from Foveon sensors, and I am curious whether pixel shift could match Foveon in some circumstances. I am also wondering whether, with only 4 shifts, the exposure lag would be far less, significantly reducing the risk of subject movement, or not.
In the OP I compared sharpness by upsampling the single exposure and 4-shot shift from their native 24MP to 96MP, to match the 16-shot's native 96MP. I used Photoshop's Preserve Details 2.0 resampling algorithm, which is the preferred method for upsampling. The problem is that Preserve Details 2.0 involves some implicit sharpening effects, which means the upsampled single exposure and 4-shot shift look perceptually sharper than the 16-shot, since the latter was presented at its native resolution without any upsampling-induced sharpening.
To demonstrate this effect here is an animation showing the 4-shot shift upsampled to 96MP with two different methods - Preserve Details 2.0 and Bicubic Smoother. I compare both to the 16-shot shift at its native 96MP resolution:
Notice how the 4-shot that's upsampled via Bicubic Smoother has about the same perceptual sharpening visible vs the native 16-shot image - this is good and what we want. From that we can tell the 16-shot actually resolves more detail, without any sharpening differences to make the comparison more difficult. Notice also how the 4-shot upsampled via Preserve Details 2.0 makes that comparison more difficult due to its implicit sharpening.
There are a few ways to solve this. How about upsampling both the 4-shot shift and 16-shot via Preserve Details 2.0 to a common higher resolution, say 192MP? That way both images undergo the algorithm's sharpening effects. Here's how that looks:
Hmm, the 4-shot still has more sharpening apparent vs the 16-shot even though both were upsampled via the same PD 2.0. I believe that's because the 4-shot undergoes a greater upsample (24MP -> 192MP) vs the 16-shot (96MP -> 192MP), and so the algorithm results in more apparent sharpening.
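The difference in upsampling is easy to quantify. Resampling works per axis, so the linear scale factor is the square root of the megapixel ratio. A quick Python check (the function name is my own):

```python
import math

def linear_scale(source_mp, target_mp):
    """Per-axis enlargement needed to go from source_mp to target_mp."""
    return math.sqrt(target_mp / source_mp)

for label, src, dst in [
    ("4-shot  24MP -> 96MP ", 24, 96),
    ("4-shot  24MP -> 192MP", 24, 192),
    ("16-shot 96MP -> 192MP", 96, 192),
]:
    print(f"{label}: {linear_scale(src, dst):.2f}x per axis")
```

So to reach 192MP the 4-shot is enlarged about 2.83x per axis while the 16-shot is enlarged only about 1.41x, which fits the idea that the algorithm has more opportunity to add apparent sharpening to the 4-shot.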
How about if we upsample both via Bicubic Smoother:
In summary, there are two valid ways I see of comparing the pixel shifted images. One is to upsample the lower-resolution versions via Bicubic Smoother to match the resolution of the highest image presented. The other is to upsample all images to a common higher resolution, also with Bicubic Smoother.
Did you look at how successful Nikon is at combining moving elements in the frame, such as leaves, trees, etc.? I played a bit with pixel shift on the Fuji GFX 100 but stopped using it a long time ago, as I found it almost impossible to take a picture of something that has no moving object at all over the entire pixel shift sequence (for the type of subjects that I shoot). Fuji's program was not all that successful at combining these slight motion artifacts, unfortunately.
I have been trying pixel shift out a fair bit and it's not bad but underexposed areas that are corrected seem to show banding to me. After many tests, I am sticking to my Z9 for images that are going to be taken in those more challenging environments.
I noticed a peculiar artifact issue early on with 16-shot shifts but revisited it today and it seems to occur on every 16-shot shift I do. Doesn't happen on the 32-shot shifts, which is the other sub-pixel shift mode the camera supports. Also doesn't occur on the full-pixel shift modes (4/8 shot). Here's an animation demonstrating the issue:
It's barely visible at 100% without processing but becomes obvious when the image is sharpened. I've enlarged it to 500% with nearest-neighbor to make it more obvious. Seems to repeat at fixed row intervals (approx every 23 rows or so). Doesn't occur on all rows.
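If someone wants to check a file for this kind of row periodicity numerically rather than by eye, here is a hedged Python sketch. It assumes the artifact shows up as a row whose mean brightness differs from its two neighbors, which may not be how the real artifact behaves. The image is synthetic, with hypothetical bad rows planted every 23 rows, and all names and the threshold are made up for illustration.

```python
def row_means(img):
    """Mean pixel value of each row."""
    return [sum(row) / len(row) for row in img]

def outlier_rows(img, threshold):
    """Rows whose mean differs from the average of their two neighbors."""
    means = row_means(img)
    flagged = []
    for r in range(1, len(means) - 1):
        neighbors = (means[r - 1] + means[r + 1]) / 2
        if abs(means[r] - neighbors) > threshold:
            flagged.append(r)
    return flagged

# Synthetic flat gray image with a brighter row every 23 rows.
planted = {10, 33, 56, 79}
img = [[110 if r in planted else 100] * 8 for r in range(100)]

rows = outlier_rows(img, threshold=6)
spacing = [b - a for a, b in zip(rows, rows[1:])]
print("flagged rows:", rows)
print("row spacing:", spacing)
```

On a real image the threshold would need tuning against scene detail, and a flat, evenly lit target would make the rows far easier to isolate.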
I'm certain it's not movement with my setup - I'm working on a concrete surface with excellent support. Not sure why it wouldn't show on the 32-shot shift - perhaps the artifact is median-blended away with whatever algorithm NX Studio is using.
Not a big fan of pixel shift myself. The resolution is awesome but I find that in practice, for my use, which is mostly outdoors and not in controlled environments, you always get some trees or leaves moving, clouds slightly shifted, etc. After playing with pixel shift a bit on the GFX100 and 50s, I pretty much gave up on it, as I often find artifacts from combining multiple exposures when something moved. The only time pixel shift shone was when I was helping my friend do a little archive work on his collection of paintings.
I have not tried Nikon pixel shift yet, but how do you find Nikon, or rather Nikon's software, at handling objects that move slightly during a pixel shift sequence?