Using Statistical Analysis for Cleanup

Discussion in 'Tools' started by Karyudo, Aug 3, 2019.

  1. Karyudo

    Karyudo Jedi Master

    Could you elaborate on this? What's "the dirt map cleaning method"? And what are "transfer modes"?

    I'm wondering if this is similar to the "ToTooT" Avisynth filter I commissioned years ago, which took as input three aligned sources, and output the "most likely" average pixel for each location, based on which two out of the three sources (or all three, within some small tolerance) were closest in RGB values.
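    For anyone curious, the best-two-of-three logic I'm describing can be sketched per pixel value in a few lines of Python (the function name is hypothetical, and the real filter of course works on whole frames, per RGB channel):

    ```python
    def totoot_pixel(a, b, c, tol=2):
        """Pick the 'most likely' value from three aligned sources.

        If all three values agree within `tol`, average all three;
        otherwise average the two values closest to each other and
        treat the remaining source as the outlier (dropout/noise).
        """
        # All three agree within tolerance: average everything.
        if max(a, b, c) - min(a, b, c) <= tol:
            return round((a + b + c) / 3)
        # Otherwise find the closest pair and average it.
        pairs = [(abs(a - b), a, b), (abs(a - c), a, c), (abs(b - c), b, c)]
        _, x, y = min(pairs)
        return round((x + y) / 2)
    ```

    Run over every pixel of every frame, a dropout on one capture gets outvoted by the other two.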
    williarob likes this.
  2. williarob

    williarob Administrator Staff Member

    That sounds useful, I might have to test that out...
    Last edited: Aug 5, 2019
  3. Karyudo

    Karyudo Jedi Master

    OK, turns out I knew exactly what "the dirt map cleaning method" was: semi-automatic, once somebody has done all the hard work once. I was hoping there would be more automation by now.

    Which I think is possible, actually.

    Using ToTooT, I built a near-perfect GOUT from LD before there was a GOUT. Capture three different copies of the same LD issue, and you can all but guarantee the sources are aligned spatially and temporally. But the analog artifacts (video dropouts and random capture noise) will be in different places on each disc. Taking the average of the RGB values of the two most-similar pixels of the three captures will give you a result that's closer to the source material than either capture by itself.

    I really don't think we take advantage of probability and statistics nearly enough. If we rely on one algorithmic method for dust-busting or grain removal or denoising, we expect it's got to be perfect, or we can't use it. But is every method going to be wrong in the same places? I'd argue that the answer is, "unlikely." So what happens if you ToTooT (or TooF or TooS or TooE...) a bunch of "pretty-good" methods?

    I figure you could do a couple of things (at least) with this idea: either be conservative, and make a pretty decent dust map (which will have to be edited and applied with further user intervention); or "go big" and automatically remove all the dust (at the risk of wasting a bunch of time outputting something useless).

    I can go on, if this sounds like it's worth pursuing. (*I* think it is, but I'm biased by the fact I know it works...!)
    camroncamera and dahmage like this.
  4. williarob

    williarob Administrator Staff Member

    Dirt maps can also be generated by the scanner using an infrared camera. The Scanity, Scanstation, Arriscan and other professional scanning equipment will take an infrared picture and put it in the alpha channel of the final image. They are very accurate, and I have a couple of scans that came with these dirtmaps. Both PF Clean and Phoenix have tools that can use these dirtmaps for auto cleanup, but they're not the magic bullet people think they are: sure, only the actual dirt on the print gets targeted, but the repair is still prone to exactly the same issues as software-detected dirt. When there is a lot of motion, or one of those pink flash frames in Star Wars, it patches all the dirt with pixels of the wrong color, making it more obvious than it was before.

    The other way to use the dirtmap is to place a clean or alternate version of the same frame underneath the layer you are trying to clean, and then use the alpha channel of the dirtmap for transparency. With multiple captures of a laserdisc this would work beautifully - they already align perfectly and the colors are the same - but with print scans it is very hard to align them automatically: prints warp, there is gateweave (caused both by the film passing through the gate and printed into the film itself, different for every print), and the colors will usually differ.

    I cleaned Reel 1 of Star Wars this way, and it took a lot longer because of how difficult it was to align and color match the sources. If the colors don't match perfectly, you can see the repair layer. If you look at Reel 1, particularly the Tantive and desert scenes, you can see a vertical line on the left side - that is the result of imperfect color matching on the layer underneath. It's very close (I used Dre's tool), and side by side they look the same, but it's not close enough.
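    The compositing itself is the easy part - a numpy sketch (function name hypothetical; the hard work is the alignment and color matching):

    ```python
    import numpy as np

    def patch_with_dirtmap(dirty, clean, dirt_alpha):
        """Composite a clean/alternate frame under a dirty one, using the
        scanner's IR dirt map (alpha channel) as a transparency mask.

        dirty, clean: HxWx3 uint8 frames (must already be aligned and
        color matched, or every patch will be visible).
        dirt_alpha: HxW float mask, 1.0 where the IR pass saw dirt.
        """
        alpha = dirt_alpha[..., None]  # broadcast the mask over RGB
        out = dirty * (1.0 - alpha) + clean * alpha
        return out.astype(np.uint8)
    ```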

    So, while I agree that this method is sound and works great when you can perfectly align and color match your source layers, it's just extremely difficult to do with film scans. I ended up ignoring the dirtmaps for Reels 2-6 and cleaned the remaining reels (with help from other people) more than twice as fast.
    camroncamera likes this.
  5. Karyudo

    Karyudo Jedi Master

    The difference between generating a dirt map from a scanner (which I know has been possible for years) and my proposed method is that an IR-scan dirt map doesn't say or do anything about what to replace the dirt with.

    Actually, the dirt map (as you've discussed it) is really a red herring: I don't think it's necessary. I think you could go directly to an automated "statistically best" fix without worrying about generating or using a dirt map.

    I also think aligning different film sources is a red herring: again, I don't think it's necessary. I suspect you could go directly to an automated "statistically best" fix without worrying about aligning sources, too.

    I guess I'm going to have to go away and re-learn all the stuff I've forgotten about AviSynth since I used it last maybe a decade ago (like, there's a thing called VapourSynth now?!) and run some tests....
    camroncamera likes this.
  6. williarob

    williarob Administrator Staff Member

    While I agree that the dirtmap isn't necessary, the alignment and color matching most certainly are. I can tell you from experience that even if the prints are misaligned by only a few pixels, you will often see the wrong colored pixels in place of the dirt. Sometimes it doesn't matter - in the Tatooine desert scenes, for example, if the sand being replaced isn't the exact same piece of sand, it probably won't show - but misalign a piece of a Rebel Trooper's helmet so that a big chunk of it shows up on his forehead and you'll see it immediately! I know, because I did it.

    Statistics are only as good as the data they are based on: garbage in = garbage out. I can't just take the LPP and the two Tech scans, write an AviSynth script that stacks them "as is", and expect good results. They have to be aligned (if not precisely, then at least as closely as possible) and color matched (one Tech is quite green, the other quite yellow, and the LPP is quite blue - what do you suppose happens if we just stack them and hope for the best? None of the pixel colors will match 99% of the time), and every frame must be the same frame at the same time (missing frames must be patched so that all sources are perfectly in sync), or it just won't work - you can't patch holes in the desert with frames from the Tantive and expect them to be invisible.
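    To make the color problem concrete, here is about the crudest possible per-channel match in Python - purely illustrative (Dre's tool and the ImageJ macros do far more than this), but it shows the kind of normalization that has to happen before sources can be stacked at all:

    ```python
    import numpy as np

    def match_color(src, ref):
        """Crudely shift and scale each RGB channel of `src` so its mean
        and standard deviation match `ref`. A green print and a blue
        print at least land in the same ballpark afterwards; real color
        matching is far more sophisticated than this.
        """
        src = src.astype(np.float64)
        ref = ref.astype(np.float64)
        out = np.empty_like(src)
        for c in range(3):
            s, r = src[..., c], ref[..., c]
            scale = r.std() / (s.std() + 1e-9)  # avoid divide-by-zero
            out[..., c] = (s - s.mean()) * scale + r.mean()
        return np.clip(out, 0, 255).astype(np.uint8)
    ```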

    So my point is that there is a huge amount of prep work that has to be done, before you can run it through a script like this, and in my experience - of which I hope you'll agree I have quite a lot at this point - getting to that point is usually more time consuming than just cleaning the same frames by hand. Believe me - nobody wants a quick and easy way to do this more than me, but there just isn't a "one script to rule them all" solution to this problem. (Though Althor1138's ImageJ scripts are an excellent start - since they do a really nice job of aligning multiple sources, and a fairly good job of matching colors, at least some of the time. Of course, you still have to frame sync your sources before running the macros).
    Last edited: Aug 4, 2019
    camroncamera and oohteedee like this.
  7. Karyudo

    Karyudo Jedi Master

    I hear you about matching sources. I know all about trying to match sources. I've done it (or tried to) for more than a decade for many projects. It was hard enough using low-res video sources; it's practically impossible with film sources without the sort of near-magic warping tools Mike Verta spent good money to have coded in Eastern Europe. But you don't have to align sources if you start with just one source.

    Also, I disagree that statistics are just GIGO. Look at superresolution using wavelets: that can give a distinctly not-garbage result from a bunch of garbage inputs, using statistical methods (on temporal data) to get there.

    The prep work isn't trivial, but it shouldn't be that time-consuming. Only two things need to happen, really: finding a handful of dust-busting algorithms that give decent results on a "one-light" pass through the whole film, and writing the script to do the comparisons.

    I agree that of course there isn't a "one script to rule them all" solution—that's been my whole point thus far—but I think there probably is a "one script to bring them all and in the darkness bind them" solution.
    Last edited: Aug 4, 2019
    camroncamera likes this.
  8. williarob

    williarob Administrator Staff Member

    So your idea is more of a "run it through multiple dustbusting processes and then average the results" sort of thing? Perhaps "averaging" isn't the right word, given the statistical analysis part, but I think I see what you're getting at. That way, there is no need to worry about aligning or color matching or syncing, since it's all from a single source. That's not a bad idea, especially if all you are looking for is a very watchable, quick and dirty cleanup. But it does seem like the kind of solution that would result in missing (or at least faded) laser bolts, sparks, stars, etc., since most algorithms - even expensive solutions like PF Clean and Phoenix - often make mistakes with those sorts of things.

    While doing the inverse ("uncleaning" or putting back details that were accidentally erased) is sometimes quicker - particularly with very dirty source material - it still requires frame by frame analysis with the human eye, and it's very easy to miss things that were accidentally removed - especially stars... Much easier to do the erasing manually than restoring. If somebody's leg is missing, that's easy, but how can you tell if a single star is missing in a panning shot in space just by looking?

    That said, it would certainly be a viable solution for scenes like Luke and the Lars family at dinner, or other mostly static shots... But then again, PF Clean and Phoenix can handle that sort of shot very well already, and in a single pass... I guess if you don't have access to those tools this would be the way to go.

    Sorry - I'm not trying to dissuade you; I am genuinely interested in learning and trying new techniques.
    Last edited: Aug 4, 2019
    dahmage likes this.
  9. Karyudo

    Karyudo Jedi Master

    Yeah, that's about where I'm headed. I figure something like...

    S = source
    A = source, processed by some algo (some spatial-only filter, say)
    B = source, processed by some other algo (some temporal-only filter, say)
    C = source, processed by yet another algo (some spatio-temporal filter, say)
    D = source, processed by... more algos (some grain-reduction filter, say)
    [E, F, G... PF Clean, Phoenix, etc. I suspect more well-constructed options are better than fewer]
    r = some "radius" in per-channel level [i.e. 0-255 for each 8-bit channel in RGB]

    So for each pixel in S:
    ... if the difference between S and A, S and B, S and C, and S and D is <r, then output = S (i.e. "if it ain't broke, don't fix it")
    ... if the difference between S and A, S and B, S and C, or S and D is >r, and if the outlier is S (that is, A, B, C, and D are closer to each other than to S), then use something like the average of the two closest processed sources.
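    In rough Python (numpy, names hypothetical; I'm using "nearest the per-pixel median" as a stand-in for "the two processed values closest to each other", which is the same idea and easier to vectorize):

    ```python
    import numpy as np

    def bind_them(S, processed, r=8):
        """Per-pixel consensus between source frame S and several
        processed versions of it (all HxWxC uint8 arrays).

        Where every processed version stays within radius r of S, keep
        S ("if it ain't broke, don't fix it"). Where S is the outlier,
        output the mean of the two processed values that agree best.
        """
        S = S.astype(np.int16)
        P = np.stack([p.astype(np.int16) for p in processed])  # (N,H,W,C)
        diffs = np.abs(P - S)             # distance of each algo from S
        keep = (diffs <= r).all(axis=0)   # every algo agrees with S here
        # For disputed pixels, take the two processed values nearest the
        # per-pixel median as the consensus pair and average them.
        order = np.argsort(np.abs(P - np.median(P, axis=0)), axis=0)
        closest2 = np.take_along_axis(P, order[:2], axis=0)
        fix = closest2.mean(axis=0)
        out = np.where(keep, S, fix)
        return np.clip(out, 0, 255).astype(np.uint8)
    ```

    A dirt fleck that only survives one algo's output gets outvoted; pixels every algo left alone pass through untouched.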

    I suspect this would be fairly hard on dirt (assuming well-chosen algos will bust most of the dust most of the time?), relatively kind to lasers (assuming well-chosen algos may fail, but each will fail differently?), and preserve most of the source most of the time.

    I also suspect the output could be better than relying on the human eye: I remember running ToTooT once on a sample clip using a script that identified frames with flaws, and it found frames that I had to look very hard at to see where the flaws were. (I'll see if I can find my original samples.)

    The beauty of this method (as I see it) is that it would be possible to crowd-source individual filter implementations:
    1) Publish an assemble-edit test clip of maybe 30 seconds or so that includes some representative challenges (near static shot, whip pan, heavy grain, light grain, lasers, explosions, space, sky... whatever makes sense)
    2) Set some rules (must output as such-and-such colour space, must use this particular version of AviSynth or VapourSynth, must run the filter over the whole test clip, can't change clip dimensions or drop frames, can't use any stabilization, can't change overall brightness/contrast/colour temp, etc.)
    3) Accept "entries" as a processed sample clip and the AviSynth/VapourSynth script.

    You'd be able to watch each sample clip to see if it's half-decent; and if it is, then you can add the script that produced it as a processed source to the overall ITDBT ("in the darkness bind them") script.
    dahmage, Moonstrider and williarob like this.
  10. williarob

    williarob Administrator Staff Member

    I like where you're headed. See what you can come up with. Here is a clip for you to play with that contains a nice variety of shots (including starfields) and plenty of real dirt and dust (It's how reel 1 of 4K77 looked before I cleaned it):

    Nate D likes this.
  11. Karyudo

    Karyudo Jedi Master

    Cool. I'll warn you now: it's going to take me quite some time to make visible headway on this. Gotta re-learn (and update) everything I know about AviSynth. Plus figure out who the "cool kids" are that can build good test candidate scripts. Oh, and actually write some pseudo-code and have it crafted into a plugin.
    williarob and Nate D like this.
  12. Karyudo

    Karyudo Jedi Master

    Step One complete: found my original zip archive with TooT in it....
    camroncamera and Nate D like this.
  13. williarob

    williarob Administrator Staff Member

    I think you'll find AviSynth hasn't changed that much - there are probably a bunch of new plugins and built-in functions, but for the most part, the script-writing syntax remains the same. I still haven't tried VapourSynth for anything...
  14. Karyudo

    Karyudo Jedi Master

    Yeah, so far, so good. Looks like there's a new text editor of choice, and maybe AVS+ is a preferred fork, but it's starting to feel more familiar.
  15. Althor1138

    Althor1138 Padawan

    I think what you are describing could be done with AVS+ and the EXPR filter. It uses Reverse Polish Notation which is always a bitch but I can usually wrap my head around what I'm trying to do after hours of banging my head on the keyboard. I can imagine it would be agonizingly slow working with 4 or more clips but 1 FPS is glorious when you don't have to interact with it and it does a good job.
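    For the RPN-averse: an Expr-style string like "x y - abs" is just a stack - push clip values and literals, pop the top entries for each operator. A toy evaluator in Python (illustrative only, not how AVS+ actually implements Expr, and covering only a handful of its operators):

    ```python
    def eval_rpn(expr, **clips):
        """Evaluate an Expr-style RPN string for a single pixel.

        `clips` maps clip letters (x, y, z...) to pixel values, e.g.
        eval_rpn("x y - abs", x=10, y=14) gives the absolute difference.
        """
        stack = []
        for tok in expr.split():
            if tok in clips:                       # clip letter: push its value
                stack.append(clips[tok])
            elif tok == "abs":                     # unary operator
                stack.append(abs(stack.pop()))
            elif tok in ("+", "-", "*", "min", "max"):
                b, a = stack.pop(), stack.pop()    # binary: right operand pops first
                ops = {"+": a + b, "-": a - b, "*": a * b,
                       "min": min(a, b), "max": max(a, b)}
                stack.append(ops[tok])
            else:                                  # numeric literal
                stack.append(float(tok))
        return stack[0]
    ```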
    Karyudo likes this.
  16. Karyudo

    Karyudo Jedi Master

    Somehow I did a whole engineering degree without an HP, so I still don't know how to use RPN.... But I'll check into EXPR. Thanks!

    (It's like this thread was never anywhere but here... thanks, williarob.)
    Last edited: Aug 5, 2019
    camroncamera likes this.
  17. camroncamera

    camroncamera Jedi Master

    Do any of these imaging processors let the user easily see dirt-only outputs? That is, a movie not of, uh, the movie... but of only the dirt that has been removed, in order to easily see if any picture detail has been mistakenly removed: stars, explosions, blaster bolts, eye glints, etc.... ("Look sir, droids!")

    I might be assuming there would be utility in seeing "just the dirt" that went into the dustbuster in case there's more in that bin than there should be, but maybe it's no easier spotting picture details that were unintentionally removed than simply viewing the cleaned clip. I have not yet had the chance to look into this myself.

    Edited to add:
    I'm excited for the day that the image processing software will be so powerful that we'll easily be able to view "just the dirt", "just the film grain", "just the video noise", "just the film damage", etc., in order to reconstruct (and customize) the best possible version of any film undergoing such processing.
    Last edited: Aug 8, 2019
    dahmage likes this.
  18. Karyudo

    Karyudo Jedi Master

    I would guess that even if a particular image processor doesn't include "dirt only" as an output option, it wouldn't be particularly difficult to generate that output by doing a little math on the original input and cleaned output. I know I've done this sort of thing before, when using AviSynth, to confirm it was actually doing something. I think I also used a similar technique when trying (pretty successfully?) to line up two sources spatially.
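    That before/after math really is simple - a numpy sketch (function name hypothetical; in AviSynth itself something like Subtract() on the two clips gets you most of the way):

    ```python
    import numpy as np

    def dirt_only(before, after, thresh=10):
        """Show only what a cleanup pass changed.

        Returns a frame that is black except where `before` and `after`
        differ by more than `thresh` in some channel - there it shows
        the original pixels, i.e. whatever was removed (dirt, or,
        worryingly, stars and laser bolts).
        """
        diff = np.abs(before.astype(np.int16) - after.astype(np.int16))
        changed = (diff > thresh).any(axis=-1, keepdims=True)
        return np.where(changed, before, 0).astype(np.uint8)
    ```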

    I, too, am looking forward to being able to customize from separated base elements!
    camroncamera likes this.
  19. Karyudo

    Karyudo Jedi Master

    Tiny progress update: Got AviSynth+ installed. Got AvsPmod working. Got LSMASHSource installed and working. Remembered how to Crop() and LanczosResize() (just for previewing). Found Fizick's filters. Couldn't get DeSpot() to work (it's 32-bit). Got AvsPmod *32-bit* working.... Got DeSpot() working; removed a serious ton of dirt (with lots of artifacts, but that's future Karyudo's problem).

    LoadPlugin("C:\Program Files (x86)\AviSynth\plugins\DeSpot.dll") # DeSpot is 32-bit only, hence 32-bit AvsPmod
    LSMASHVideoSource("\\<servername>\Videos\_Misc\Star Wars Uncleaned from williarob 132-140-before.mp4")
    Crop(760, 0, -90, -0) # crop down to a test region
    DeSpot(p1=12, p2=7, pwidth=70, pheight=70, mthres=25, mwidth=20, mheight=15, interlaced=false,
      \  merode=33, ranked=false, p1percent=0, dilate=0, fitluma=false, blur=0, motpn=false, seg=0) # by Fizick; 32-bit only; not optimized AT ALL
    LanczosResize(1920, 800, taps=3) # just for previewing
    Frame 1138 (heh), output as a JPG direct from AvsPmod:
    Star Wars Uncleaned from williarob 132-140-before001138.jpg

    I know that's not spectacular, but I also don't think it's too shabby for six lines of code. And, as I've postulated above, if it's possible to quickly generate many pretty-good options, that should be enough to math my way to a "better than the sum of the parts" solution.
  20. Karyudo

    Karyudo Jedi Master

    Same frame (1065) as williarob posted above:
    Star Wars Uncleaned from williarob 132-140-before001065.jpg
    DVD-BOY and camroncamera like this.
