Object storage takes space.
Sometimes the object is large and space is at a premium.
Optimizing access for usage versus storage for reduced cost requires a representational transformation.
One such transformation is compression.
Many different methods of compression exist, almost all requiring their own evaluator.
Each may be simple to invoke on its own, but together they become unwieldy as more of them come into use.
In addition to finding a very good compression technique, it would also be useful to have a programmatic interface to tabularize these methods (similar to Jensen's Device, or thunking): tuck/untuck for compression, and redictuless for representation conformation.
See also “The specification, factoring, defaulting, and overriding of attributes”.
Define, unify, name, specify, enumerate/explicitly list, default, s?, tabularize, override/respecify, extensibly (dunseldtore)
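As a concrete sketch of the tabular interface, the tuck/untuck pair can dispatch through a single table of named methods. This is a minimal illustration assuming Python's zlib and bz2 as stand-in evaluators; the names tuck and untuck come from the text above, everything else is hypothetical.

```python
import bz2
import zlib

# One table of named methods, each a (tuck, untuck) pair of evaluators.
# zlib and bz2 are stand-ins for whatever compressors are actually in use.
METHODS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "identity": (lambda d: d, lambda d: d),
}

def tuck(method: str, data: bytes) -> bytes:
    """Compress data with the named method from the table."""
    return METHODS[method][0](data)

def untuck(method: str, data: bytes) -> bytes:
    """Invert tuck for the same method name."""
    return METHODS[method][1](data)

# Adding a new method extends the table instead of adding new call sites.
METHODS["reverse"] = (lambda d: d[::-1], lambda d: d[::-1])
```

The point of the table is that each new compressor is one new row, not one new special case at every call site.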
2 Table of Contents
5.1 Historical Analysis
5.1.1 Archives & Archivers
5.1.1.1 In a file system
5.1.1.2 In a directory
5.1.1.3 In a file
5.1.2 Primitive identity
5.1.3 Vampt: Value, Attribute, Mode, Property, Type
5.1.4 Enumeration: Union of primitives
5.1.5 Union: One alternative of many by tag
5.1.6 Structure: List of dissimilars, fixed
5.1.7 Set, list, array: Multiple similar, dynamic
5.1.8 Internal versus external
5.1.9 Transportable versus transfixed
5.1.10 Bound versus unbound
5.1.11 Machine versus human
5.1.12 Binary versus text
5.1.13 Name assign value
5.1.13.1 Huffman byte to bit: Pack
5.1.13.2 LZ, LZW
You can tell a lot by looking at the data structure of the resulting compressed data. Each part of the structure gives clues on how to improve it.
1) The data structure of pack is a list of self-terminating, variable-length bit codes, either recalculated at each step (as in a data stream) or calculated maximally across an entire file.
2) The codes differ at each step as each new character is encoded, and the rules for assigning them may vary.
3) The latter approach builds a single table and prefixes it to the data structure.
4) Note that a fixed table for an entire file is only an average optimal estimate, in that a different rule for populating the table might result in denser coding.
5) The dynamic one starts with varying codes, is loose in the assignment of the codes, and may not achieve optimal compression until enough of a sample is built up.
6) Different rules for how to reassign the weights to the bit strings may work better or worse in either case (full file or stream), depending on how the data actually changes from packet to packet. Having the entire file to analyze can result in finding optimal points to insert encoding changes: either according to a fixed rule, or by explicit dictionary edit.
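The full-file variant in point 1 (one code table computed from whole-file frequencies and prefixed to the data) can be sketched as follows. This is a minimal illustration of the code-table construction only, not pack's actual on-disk format:

```python
import heapq
from collections import Counter

def huffman_codes(data: bytes) -> dict:
    """Build one fixed code table from whole-file symbol frequencies."""
    # Heap entries: (weight, tiebreak, {symbol: code-so-far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(Counter(data).items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    if len(heap) == 1:  # degenerate case: only one distinct symbol
        return {s: "0" for s in heap[0][2]}
    while len(heap) > 1:
        # Merge the two lightest subtrees; their codes grow by one leading bit.
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

codes = huffman_codes(b"aaaabbc")
# The most frequent symbol ('a') gets the shortest bit code.
```

The resulting table is exactly the "average optimal estimate" of point 4: optimal for the whole-file frequencies, not for any particular region of the file.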
LZW, like the earlier LZ, builds up a dictionary of strings as it goes; but it too could use a full-file analysis in order to prefix a dictionary to the file, or have an assumed one for various contexts.
Varying rules for changing the entries in this table could also be applied for higher density, similar to Huffman encoding in pack.
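The accretion described above (start from single-byte entries, then add one new string per step, each built from an existing entry plus one byte) looks like this in miniature; the sketch covers only the encoder's dictionary growth, not the bit-packing of codes:

```python
def lzw_encode(data: bytes) -> list:
    """LZW: emit codes while accreting a dictionary of previously seen strings."""
    table = {bytes([i]): i for i in range(256)}  # the assumed initial dictionary
    out, cur = [], b""
    for b in data:
        nxt = cur + bytes([b])
        if nxt in table:
            cur = nxt                 # keep extending the current match
        else:
            out.append(table[cur])    # emit the code for the longest match
            table[nxt] = len(table)   # accrete: existing entry plus one byte
            cur = bytes([b])
    if cur:
        out.append(table[cur])
    return out
```

Repetitive input makes the accretion visible: later codes stand for ever-longer strings, which is where the density comes from.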
When the data structure itself is examined, the most noticeable condition is that the individual elements have a very dense part (the code) and a very sparse part (the datum).
The result is that nearly half (8 of 20) of the bits in the compressed file are effectively not compressed at all.
So for a file compressed to 50% of its original size, the datum portion (8/20 of the output, or 20% of the original size) offers roughly another 10% of the original size in savings if it too were compressed by half, for a final size of 40%: a 60% total reduction.
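The arithmetic behind these figures, made explicit (the 8/20 split and the 50% first-pass ratio are the text's assumptions):

```python
compressed = 0.50        # first-pass output as a fraction of the original size
datum_share = 8 / 20     # fraction of that output which is raw, uncompressed datum

datum_abs = compressed * datum_share  # datum bits: 0.20 of the original size
saved = datum_abs * 0.50              # compress the datum part by half: 0.10
final = compressed - saved            # 0.40 of the original size
reduction = 1.0 - final               # 0.60: a 60% total reduction
```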
5.1.13.2.1 Ordered list of
5.1.13.2.2 Compression element of
5.1.13.2.3 Structure of
5.1.13.2.4 which in itself is a dictionary of
5.1.13.2.5 prefixed fixed dictionary and
5.1.13.2.6 accretion growth list of
5.1.13.2.7 compression element of
5.1.13.2.8 structure of
5.1.13.2.8.1 fixed assigned code
5.1.13.2.8.2 fixed datum
5.2 New methods
5.2.1 DeStufdt: Data extended system transform unified file/directory trees
5.2.2 Mafs: Media archive file systems
5.2.3 Vaptocs: Volumes and partitions table of contents systems
Most programs have the option of presenting the data as seen or keeping it in an internal format.
Changing back and forth between them should be an essentially transparent operation.
This can be done, as in the “dotz” and “MVFS” NFS-compliant file systems, by tying a front end program to the file operations in order to interpret the sequences involved.
This introduces new errors (“cannot convert properly” and similar), but has the side effect of allowing canonical representations to be maintained with minimal cost overhead.
What is needed is a systematic way to state these variant representations, and have a single interface of common semantics to implement this.
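Such a single interface might look like a table of named representations, each entry giving the two directions of conversion; the front end applies one on open and the other on write-back. Everything here is a hypothetical illustration, not the dotz or MVFS interface:

```python
# Each representation names a (present, canonicalize) pair of converters.
REPRESENTATIONS = {
    "dos": (lambda s: s.replace("\n", "\r\n"),   # canonical -> as seen
            lambda s: s.replace("\r\n", "\n")),  # as seen -> canonical
    "upper": (str.upper, str.lower),
}

def on_open(rep: str, stored: str) -> str:
    """Convert the canonical stored form to the presented form."""
    return REPRESENTATIONS[rep][0](stored)

def on_write(rep: str, presented: str) -> str:
    """Convert the presented form back to the canonical stored form."""
    return REPRESENTATIONS[rep][1](presented)
```

The canonical form stays the only thing actually stored; "cannot convert properly" errors arise exactly when a pair of converters fails to round-trip.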
5.2.4 File systems
5.2.4.1 Typed files
5.2.4.1.1 Magic numbers
5.2.4.1.2 Full data type systems
5.2.4.2 Symbolic links
A symbolic link by default points to the file to link to by either a relative or absolute path.
What could actually be put in there is an arbitrary text program to provide the converter on open for a specific file, even one that distinguishes which program is opening the file.
Context-dependent files change what they look like, or even what they contain, on the basis of external data, such as variables in the process environment or what type of system the file is on.
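One way to sketch the arbitrary-text-program idea: let the link target carry a tiny chooser expression evaluated against the opening context. The "?VAR:..." convention below is entirely hypothetical; real symbolic links hold only a path:

```python
def resolve(link_target: str, context: dict) -> str:
    """Resolve a link target that is either a plain path or a chooser of the
    hypothetical form '?VAR:value=path,value=path,*=default'."""
    if not link_target.startswith("?"):
        return link_target  # ordinary relative or absolute path
    var, _, rest = link_target[1:].partition(":")
    choices = dict(pair.split("=", 1) for pair in rest.split(","))
    # Pick a target by the context variable, falling back to the '*' default.
    return choices.get(context.get(var, ""), choices.get("*", ""))

# The same link opens different files depending on which program opens it.
link = "?OPENER:vi=notes.txt,browser=notes.html,*=notes.bin"
```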
5.2.5 Yar: yet another representor
Or Roy’s ridiculously recursive reformatting representor (r4).
5.2.6 Holey (Holographic object lazy evaluation ylem(inator))
Scattering the information uniformly across every bit is both an effective cryptographic technique as well as an effective compression technique.
All ya gotta do is pick a good evaluator.
Using the input data as a transform to find the coefficients of a polynomial in N space to rebuild the input data based on those coefficients, is one method. There are others.
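One way to make the polynomial method concrete: fit the exact polynomial passing through the points (i, data[i]) and keep its coefficients. Every input byte then influences every coefficient, which is the scattering property; this sketch shows only the scatter-and-rebuild round trip over exact rationals, not a practical compressor or evaluator:

```python
from fractions import Fraction

def solve(rows: list, n: int) -> list:
    """Gauss-Jordan elimination over exact rationals (rows are n x (n+1))."""
    for col in range(n):
        p = col
        while rows[p][col] == 0:  # find a usable pivot and swap it up
            p += 1
        rows[col], rows[p] = rows[p], rows[col]
        inv = rows[col][col]
        rows[col] = [x / inv for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def to_coefficients(data: bytes) -> list:
    """Coefficients of the unique degree < n polynomial with p(i) = data[i]."""
    n = len(data)
    rows = [[Fraction(i) ** j for j in range(n)] + [Fraction(b)]
            for i, b in enumerate(data)]
    return solve(rows, n)

def rebuild(coeffs: list) -> bytes:
    """Evaluate the polynomial at 0..n-1 to recover the original bytes."""
    n = len(coeffs)
    return bytes(int(sum(c * i ** j for j, c in enumerate(coeffs)))
                 for i in range(n))
```

Changing any single byte of the input changes every coefficient, so no coefficient in isolation reveals anything local about the data.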
5.2.7 Extensible lazy evaluation
The basic data structure is one I call redictuless:
Recursively extensible data inference compression tree using lazy evaluation symbol strings.
The basic idea is that the compression codes themselves recursively refer to other codes in the list.
If the table is built up by scanning the entire file, then the codes themselves may recursively refer to themselves (if careful construction is used to provide a terminator for such).
If done by stream oriented scanning, the codes would use the ones already found to add new ones to the list.
If all elements of each code have approximately the same data density, then the index of codes will be at maximum compression levels in all parts of the file or data stream.
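The recursive reference idea in miniature: a table whose entries mix literal bytes with references to other entries, expanded lazily on decode. The table contents here are hypothetical:

```python
# Each code maps to a sequence of items: an int is a literal byte, a str is a
# (possibly recursive) reference to another code in the same table.
TABLE = {
    "A": [104, 105],        # the literal bytes "hi"
    "B": ["A", 33, "A"],    # a code built from another code
    "C": ["B", "B"],        # codes recursively referring to codes
}

def expand(code: str) -> bytes:
    """Lazily expand a code by following references into the table."""
    out = bytearray()
    for item in TABLE[code]:
        if isinstance(item, int):
            out.append(item)
        else:
            out.extend(expand(item))
    return bytes(out)
```

A short reference high in the table can stand for an arbitrarily long expansion lower down, which is where the extra density over a flat dictionary comes from.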
5.2.7.1 Ordered list of
5.2.7.2 Compression element of
5.2.7.3 Union of
5.2.7.4 Method/s, which are
5.2.7.4.1 Termination assessor
If a recursive encoding is allowed, then there must be a method of terminating the recursion.
The top level codes can be used to hold a parameter block for the assessor if desired.
5.2.7.4.1.1 Level count
A fixed number (do not recurse more than three times) can be used.
Or a variable one.
5.2.7.4.1.2 DETECT: Data Evaluation Test Extended Context Termination
A full functional routine may be used as a standard item, with additional ones added in.
The method may be specified by an explicit fixed code, or by an extensible startup or runtime extended list of such assessors.
The parameters may be null, at the level of the assessor, or picked from the surrounding context.
With any dynamic recursive test for end of expansion (as in any macro system: cpp, m4, or the lambda calculus, for example), some form of safety must be invoked to stop runaways on improper execution or structure encoding.
If the codes are built up as simple macros with other code references, then expansion may be terminated via macros that cease to expand, and no separate assessor may be needed; although parameters would still need to be supplied (fixed, stated, context implied, or explicitly stated).
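The two assessors above (a fixed level count and an extensible DETECT-style test) can be sketched as guards on the expander; the names and signatures are hypothetical:

```python
class RunawayExpansion(Exception):
    """Raised when expansion exceeds the fixed level count."""

def expand(table: dict, code: str, depth: int = 0, max_depth: int = 3,
           detect=None) -> bytes:
    """Expand a possibly self-referential code with two termination guards."""
    if depth > max_depth:                # level count: recurse at most N times
        raise RunawayExpansion(code)
    if detect and detect(code, depth):   # extensible context-dependent assessor
        return b""
    out = bytearray()
    for item in table[code]:
        if isinstance(item, int):
            out.append(item)             # literal byte
        else:
            out.extend(expand(table, item, depth + 1, max_depth, detect))
    return bytes(out)

# A self-referential code: "X" expands to 'A' followed by "X" again.
SELF = {"X": [65, "X"]}
```

With a detect function supplied, the self-reference terminates cleanly; without one, the fixed level count stops the runaway instead.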
5.2.7.4.2 (optional, implied) code/assessor
5.2.7.4.3 (optional, implied) code/parameter
5.2.7.4.4 Code/data, which is one of
5.2.7.4.4.1 Structure of
5.2.7.4.4.2 List of
Adding a carefully considered method of systematizing these ad hoc usages would yield tremendous gains.