In the late 1990s, a Dutch electronics technician named Romke Jan Bernhard Sloot announced the development of the Sloot Digital Coding System, a revolutionary advance in data transmission that, he claimed, could reduce a feature-length movie down to a file size of just 8KB. The decoding algorithm was 370MB, and apparently Sloot demonstrated this to Philips execs, dazzling them by playing 16 movies at the same time from a 64KB chip. After getting a bunch of investors, he mysteriously died on July 11, 1999, two days before he was scheduled to hand over the source code.
Fill in the blank: if it seems too good to be true, ___.
Christian Wilson
I bet we could redevelop the SDCS using constitutional neural networks to figure out the optimal coding scheme for transmitting say VP8/9 encoded data.
Adrian Brown
Damn spell checker, should be convolutional neural networks.
Ian Reed
It would work for something like South Park, but the library would be way bigger than 370 MB.
Matthew Campbell
Change that webm to Linux or open source.
Ayden Bell
=Fuck off
Ian Green
Nobody knows except the guys with the floppy.
Carter Ward
I thought about a random generator working off a seed, but then you'd need to be able to encode it, knowing from a given data what seed to feed to the random generator. Maybe a seed that's also the hash for the frame.
Isaac Reed
When I was reading the first bit of the text in OP's link I got visions of a system that might work:
All files are hashed with a simple algorithm. No two files can ever have the same hash. All hashes contain letters from ASCII, so 0xFF possible letters for a hash. Hash is divisible by the length of the original file so that each part of the file can be equally distributed and hashed out. Each part of the file results in a character that is added into the hash. When decompressing, it's raw bruteforce. A method for storing files long term and reducing their size.
Second method (which the text describes): The uncompress API comes with a 10, 100, or 1000 MB table with indexed keys for certain combinations. Each Unicode character is mapped to an indexed key. Uncompressing a file is a matter of matching keys to patterns.
Since the key is stored locally and is predownloaded for many other projects, each compressed file in the future saves tons of GB of data for compressed archives.
Second method expanded: Why download 1000 MB of indexed keys? Why not make a program that is less than 1 MB and generates the indexed keys based on a pattern? Computing them is slower than downloading over a 1GB connection, but it gives everyone access to the key table.
Magic.
Jackson Green
What happens when you have 2^n+1 files and a hash with n bits? Computing the file would work, but then you need a formula for the file. Finding a small formula for it is, more often than not, very expensive. Maybe KGB Archiver does something similar, since I remember reading it takes forever to compress.
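Here's a rough sketch of that pigeonhole problem. It assumes a toy n-bit hash made by truncating SHA-256; `tiny_hash` and `find_collision` are names made up for this post, not anything from the Sloot material. Feed it 2^n+1 different "files" and two of them have to land on the same hash, so the hash alone can't tell you which file to rebuild.

```python
import hashlib

def tiny_hash(data: bytes, n_bits: int) -> int:
    """Toy n-bit hash: the top n_bits of SHA-256 (an assumption for the demo)."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - n_bits)

def find_collision(n_bits: int):
    """Hash 2**n_bits + 1 distinct files; two of them must share a hash."""
    seen = {}
    for i in range(2 ** n_bits + 1):
        data = i.to_bytes(4, "big")  # distinct 4-byte "files"
        h = tiny_hash(data, n_bits)
        if h in seen:
            return seen[h], data, h
        seen[h] = data
    raise AssertionError("unreachable: pigeonhole principle")

a, b, h = find_collision(16)
print(a.hex(), "and", b.hex(), "both hash to", h)
```

In practice the collision shows up long before all 2^n+1 files are tried, which is the birthday effect on top of the pigeonhole guarantee.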
Lincoln Clark
I was in another thread talking about ternary computers.
What if you could convert binary into Ternary(base 3) or Vigesimal(base 20) then convert it on the fly back to binary when you need to use it on a computer.
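For what it's worth, a quick sketch suggests the base change alone doesn't buy anything: you get fewer digits, but each digit needs more bits to store. The helper `to_base` is just something written for this post.

```python
import math

def to_base(n: int, base: int) -> list[int]:
    """Digits of a non-negative integer in the given base, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n:
        n, r = divmod(n, base)
        digits.append(r)
    return digits[::-1]

data = b"any file is just one big binary number"
value = int.from_bytes(data, "big")

for base in (2, 3, 20):
    digits = to_base(value, base)
    # Each base-b digit carries log2(b) bits of information.
    bits_needed = len(digits) * math.log2(base)
    print(f"base {base:2}: {len(digits):3} digits, about {bits_needed:.1f} bits")
```

The digit count shrinks as the base grows, but digits times log2(base) stays at roughly the same number of bits.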
πfs is a revolutionary new file system that, instead of wasting space storing your data on your hard drive, stores your data in π! You'll never run out of space again - π holds every file that could possibly exist! They said 100% compression was impossible? You're looking at it!
Leo Lewis
That's kind of my idea. You'd need an algorithm to find each sequence in pi cached for quick access, which would take up even more space. It would take a long time for computers to utilize the system efficiently and eventually, the address ranges of sequences might be longer than the actual sequence itself.
Adam Carter
even supposing that pi contained every possible sequence of digits (which has not been proven) the bignum that would be required to store your offset might be larger than the data you'd want at that offset.
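a small experiment along those lines, as a sketch only: compute some digits of pi with Machin's formula (integer arithmetic plus 10 guard digits, which is an assumption that is fine for a demo but not a rigorous bound) and check where each short digit string first shows up. `pi_digits` and `mean_first_offset` are names invented here.

```python
def pi_digits(n: int) -> str:
    """First n decimal digits of pi after the point, via Machin's formula
    with integer arithmetic (10 guard digits; fine for a demo)."""
    scale = 10 ** (n + 10)

    def arctan_inv(x: int) -> int:
        # arctan(1/x) * scale = scale/x - scale/(3x^3) + scale/(5x^5) - ...
        power = scale // x
        total = power
        k, sign = 3, -1
        while power:
            power //= x * x
            total += sign * (power // k)
            k += 2
            sign = -sign
        return total

    pi_scaled = 4 * (4 * arctan_inv(5) - arctan_inv(239))
    return str(pi_scaled)[1 : n + 1]  # drop the leading "3"

digits = pi_digits(20000)

def mean_first_offset(length: int) -> float:
    """Average position at which each `length`-digit string first appears."""
    offsets = [digits.find(f"{i:0{length}d}") for i in range(10 ** length)]
    found = [o for o in offsets if o >= 0]  # ignore strings not in our window
    return sum(found) / len(found)

for length in (1, 2, 3):
    print(length, "digit strings: mean first offset", round(mean_first_offset(length)))
```

the average offset grows by roughly a factor of ten per extra digit, so writing down the offset takes about as many digits as the data it points to.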
Jacob Wood
Photons and electrons are both elementary particles. One does not exist inside the other.
Very elegant, intuitive explanation. Unfortunately it's a complete fantasy. An electron's mass is constant.
Matthew Campbell
most of them unless they have a real job. porn shoots are few and far between. you can't live on them. so most guys strip/dance/escort to pay the bills.
Sounds like he has a bunch of random video clips, totaling say 4 GB, already distributed to everyone. The 512 KB part or whatever is then "compressed" in reference to these. I don't buy it, the difference between two 1GB movies isn't just 512 KB unless you're watching some really derivative stuff. Anyway, the extreme form of this is to just distribute a 100 TB library of every movie ever, and then you can "compress" movies in just a few bytes since all you need is the name. That should give you an idea. In fact, this manner of library compression already exists and is widely used, it's called torrents.
As for impressing Philips execs, who knows. Execs are morons, they believe all sorts of retarded bullshit if you spin it right, see TED and dotcom crash. Since the algo is lost, kind of pointless to spend much time on it, but sounds like a viable but not too practical method.
One comment says that this exploits the analog signal or something. That sounds like a sort of pifs to me, or maybe some exotic analog storage tech that happens to work better for video, but in the latter case you can't really measure it with bytes (a digital unit).
Anyway, in practice compression comes down to finding patterns, meaning repetition. Typically the repetition is within files. But if you have a bunch of files that have no pattern vs. themselves, but are all somehow related to each other, compressing the whole thing in one go might save you more space than compressing each one individually. If this was your angle, you could analyze several petabytes of video files and create a Markov model, which is distributed to all end users, and then lossily compress everything in relation to that. I bet you could shave off quite a few MBs that way if you don't count the distributed Markov model since the user "would have it anyway". But like I said, my intuition is that movies aren't all that similar and you won't achieve much, especially since compressed video nowadays is probably indistinguishable from random data, and compressed versions of different movies are probably completely uncorrelated. But for lossy encoding of uncompressed data? Could be a pretty neat toy actually (impractical these days since bandwidth is often less scarce than storage).
If anyone is interested, we could try writing a proof of concept for something like this for say English novels or photos of people's faces. I bet it wouldn't be too hard.
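To get the ball rolling, here's a tiny sketch of the "everyone already has the library" idea for text. Big caveat: it swaps the lossy Markov model for something much simpler, zlib's preset dictionary (`zdict`), which is lossless. The `shared` text, `message`, `pack` and `unpack` are all made up for illustration.

```python
import zlib

# Hypothetical shared "library" that every user is assumed to have already.
shared = (
    b"It was the best of times, it was the worst of times, "
    b"it was the age of wisdom, it was the age of foolishness, "
    b"it was the epoch of belief, it was the epoch of incredulity, "
)

# A new text that reuses phrases from the shared library.
message = (
    b"it was the age of belief, it was the worst of wisdom, "
    b"it was the best of foolishness"
)

def pack(data: bytes, zdict: bytes = b"") -> bytes:
    """Compress data, optionally relative to a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=zdict) if zdict else zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()

def unpack(blob: bytes, zdict: bytes = b"") -> bytes:
    """Decompress; the receiver must hold the same dictionary."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()

print("raw:            ", len(message), "bytes")
print("zlib alone:     ", len(pack(message)), "bytes")
print("zlib + library: ", len(pack(message, shared)), "bytes")
```

The saving only shows up because the message really does overlap with the library, which is exactly the part I doubt holds for two unrelated movies.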
Zachary Brown
maybe he found out how to use the undefined space in a digital signal.
Justin Moore
but sloot wasn't compression.
Julian Morris
I played with an idea similar to this a few months ago. Mine was to describe the SUM of the file via a mathematical equation.
To give an example, imagine that all the bits of a file SUM to: 400^400+12 ... then the file can be described by the above formula. I'm still not sure how feasible this would be.
Another thing I played with: normally each bit only increases by 2^Length. But, I thought if the process was reversed and we increment FASTER at the start and then taper down as we go it might work to compress things.
And then I realized that this is actually what Binary files do already, albeit in reverse... yep, I was dumb :(
Justin Young
you just rediscovered something, keep working at it.
Juan Nguyen
Damn, a friend and I actually had a similar idea while stoned. We should implement it.
Blake Butler
...
Cameron Rogers
I've thought about this too, but you are always trading computation time for storage. You would be basically brute-forcing the seed/hash, which is not gonna happen in the universe's lifetime.
And if you could, then the hash would suffer from a lot of collisions, which would make it unusable for this purpose.
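A toy version of the brute force, as a sketch: it assumes Python's `random` module as the generator, and `generate` / `find_seed` are hypothetical names for the decoder and encoder. It only works here because the targets are one and two bytes long.

```python
import random

def generate(seed: int, length: int) -> bytes:
    """Hypothetical 'decoder': expand a seed into `length` pseudo-random bytes."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(length))

def find_seed(target: bytes, max_tries: int = 2_000_000):
    """Hypothetical 'encoder': try seeds 0, 1, 2, ... until one regenerates target."""
    for seed in range(max_tries):
        if generate(seed, len(target)) == target:
            return seed
    return None  # gave up; astronomically unlikely for 1-2 byte targets

for target in (b"h", b"hi"):
    seed = find_seed(target)
    print(target, "-> seed", seed,
          "| seed bits:", seed.bit_length(),
          "| data bits:", 8 * len(target))
```

Each extra byte multiplies the expected search by about 256, and the seed you end up with tends to need about as many bits as the data, so nothing is saved even when the search finishes.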
Jordan Gomez
Imagine a system that upon receiving a file, caches it in 1MB segments (or any other size) and indexes them by hash. Then when another transfer is started, checks with the other end if every particular segment is cached, and uses it instead of having it sent again. Over time transfers would get shorter and shorter (and the cache larger and larger).
I'm sure something like this already exists or it is impractical. Otherwise it would be used everywhere already.
Angel Sullivan
Before the transfer begins, the hashes are transferred. Then the recipient looks through them and requests the pieces of the file that are not cached, and caches them for the next time. Rinse and repeat. When the transfer is finished, do a CRC or MD5 check of the whole file or something to make sure everything went well, if not, request the cached pieces, or request them one by one, checking as we go, whatever is faster.
This sounds like it would be really useful. What am I missing here?
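Here is roughly what I mean, as a minimal in-memory sketch with no real networking. It uses SHA-256 for both the per-segment hashes and the whole-file check instead of CRC or MD5; that choice and all the names (`Receiver`, `transfer`, and so on) are my own assumptions.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1 MB segments, as suggested above

def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    return [data[i:i + size] for i in range(0, len(data), size)]

def chunk_id(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()

class Receiver:
    def __init__(self):
        self.cache: dict[str, bytes] = {}  # hash -> segment, kept across transfers

    def missing(self, ids: list[str]) -> list[str]:
        """Step 1: given the sender's hash list, report which segments we lack."""
        return [i for i in dict.fromkeys(ids) if i not in self.cache]

    def receive(self, ids: list[str], sent: dict[str, bytes], whole_hash: str) -> bytes:
        """Step 2: cache new segments, rebuild the file, verify it end to end."""
        self.cache.update(sent)
        data = b"".join(self.cache[i] for i in ids)
        if hashlib.sha256(data).hexdigest() != whole_hash:
            raise ValueError("reassembled file failed verification")
        return data

def transfer(data: bytes, receiver: Receiver, size: int = CHUNK_SIZE) -> int:
    """Simulate one transfer; returns how many segment bytes crossed the wire."""
    chunks = split_chunks(data, size)
    ids = [chunk_id(c) for c in chunks]
    by_id = dict(zip(ids, chunks))
    sent = {i: by_id[i] for i in receiver.missing(ids)}
    receiver.receive(ids, sent, hashlib.sha256(data).hexdigest())
    return sum(len(c) for c in sent.values())

rx = Receiver()
v1 = b"A" * 4096 + b"B" * 4096 + b"C" * 4096
v2 = b"A" * 4096 + b"X" * 4096 + b"C" * 4096  # one segment changed
print("first transfer sent ", transfer(v1, rx, size=4096), "bytes")
print("second transfer sent", transfer(v2, rx, size=4096), "bytes")
```

One catch the sketch makes visible: with fixed-size segments, inserting a single byte near the start shifts every later segment, so nothing after it matches the cache any more.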
Thomas Mitchell
Isn't that how BitTorrent works?
Connor Hernandez
then the goddamned kikes are suppressing it because shekels.
Kayden Davis
my understanding is the sloot system used those random images to rebuild images. like if you have a lot of ripped up construction paper, you can specify where to place them and rebuild the original image by positioning and overlaying them correctly.
i've looked at this before. it's probably possible to make a lossy compression scheme using this approach, for video. it would've been acceptable for the video quality at the time.
but for random data it can't losslessly compress to a smaller size due to kolmogorov complexity.
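the counting version of that argument: there are 2^n files of n bits but only 2^n - 1 files that are shorter, so no lossless scheme can shrink all of them. a quick sketch to see it in practice, using zlib as a stand-in for "any lossless compressor" and a seeded generator as the assumed source of random data:

```python
import random
import zlib

n = 100_000
rng = random.Random(0)  # fixed seed so the run is repeatable
random_data = rng.getrandbits(8 * n).to_bytes(n, "big")
patterned_data = b"South Park " * (n // 11)

for name, data in (("random", random_data), ("patterned", patterned_data)):
    packed = zlib.compress(data, 9)
    print(f"{name:9}: {len(data)} -> {len(packed)} bytes")
```

the patterned input collapses to almost nothing, while the random input comes out slightly larger than it went in.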
Parker Jenkins
The bigger problem with this is that the probability of using a seed and offset that are smaller than the data you're encoding alone is almost nonexistent. The seed+offset pair would probably be far larger than the file itself (especially if offset is 0).