CARDS: Compressing, Archiving, Representing Data Structures
by roy andrea crabtree   
Rated "G" by the Author.
Last edited: Sunday, October 08, 2006
Posted: Saturday, October 07, 2006


Many methods exist, and a systematic approach to integrating all of them would provide many benefits. Plus a few techniques of my own. All rights reserved. Work in progress.

1         Abstract

Object storage takes space. 

Sometimes the object is large and space is at a premium.

Optimizing access for usage versus storage for reduced cost requires representational transformation.

One such transformation is compression.

Many different methods of compression exist, almost all requiring their own evaluator.

Each of these may be simple to invoke individually, but they become unwieldy as more of them come into use.

In addition to finding a very good compression technique, it would also be useful to have a programmatic interface that tabularizes these methods (similar to Jensen's Device, or thunking): tuck/untuck for compression, and redictuless for representation conformation.
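As a rough illustration of what tabularizing the methods might look like, the sketch below puts three compressors from the Python standard library behind one tuck/untuck interface. The table keys, the `Method` type, and `tuck_best` are hypothetical names chosen for this example; only `zlib`, `bz2`, and `lzma` are real modules.

```python
import bz2
import lzma
import zlib
from typing import Callable, NamedTuple


class Method(NamedTuple):
    """One row of the method table: a paired compressor and expander."""
    tuck: Callable[[bytes], bytes]
    untuck: Callable[[bytes], bytes]


# Hypothetical table; the keys are illustrative labels, not a standard.
METHODS: dict[str, Method] = {
    "none": Method(lambda b: b, lambda b: b),
    "zlib": Method(zlib.compress, zlib.decompress),
    "bz2": Method(bz2.compress, bz2.decompress),
    "lzma": Method(lzma.compress, lzma.decompress),
}


def tuck(data: bytes, method: str = "zlib") -> tuple[str, bytes]:
    """Compress with the named method and return (name, payload)."""
    return method, METHODS[method].tuck(data)


def untuck(tagged: tuple[str, bytes]) -> bytes:
    """Expand a (name, payload) pair using the matching table row."""
    name, payload = tagged
    return METHODS[name].untuck(payload)


def tuck_best(data: bytes) -> tuple[str, bytes]:
    """Try every row in the table and keep the smallest result."""
    return min((tuck(data, name) for name in METHODS), key=lambda t: len(t[1]))
```

Adding a method then means adding one row to the table; callers keep using `tuck` and `untuck` unchanged.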

See also “The specification, factoring, defaulting, and overriding of attributes”.

Define, unify, name, specify, enumerate/explicitly list, default, s?, tabularize, override/respecify, extensibly (dunseldtore)

2         Table of Contents

3         Introduction

4         Overview

5         Main

5.1      Historical Analysis

5.1.1      Code/Programs
  Archives & Archivers
    Directory
      In a file system
      In a directory
      In a file
    Ar
    Tar
    Cpio
    Pax
    Shar
    .EXE
    zip
  Representation
    Structure
      Name
      Primitive identity
      Vampt: Value, attribute, Mode, Property, Type
      Enumeration: Union of primitives
      Union: One alternative of many by tag
      Structure: List of dissimilars fixed
      Set, list, array: Multiple similar, dynamic
    Data
      Internal versus external
      Transportable versus transfixed
      Bound versus unbound
      Machine versus human
      Binary versus text
    Binary
    Text
  Name assign value
  Compressors
    Huffman byte to bit: Pack
    LZ, LZW
      Compress
      Gzip
      Bzip

5.1.2      Data/Structure
  Archives & Archivers
    Directory
      In a file system
      In a directory
      In a file
    Ar
    Tar
    Cpio
    Pax
  Representation
    Structure
      Name
      Primitive identity
      Vampt: Value, attribute, Mode, Property, Type
      Enumeration: Union of primitives
      Union: One alternative of many by tag
      Structure: List of dissimilars fixed
      Set, list, array: Multiple similar, dynamic
    Data
      Internal versus external
      Transportable versus transfixed
      Bound versus unbound
      Machine versus human
      Binary versus text
    Binary
    Text
  Compression

You can tell a lot by looking at the structure of the resulting data. Each part of the structure gives clues on how to improve it.

Pack
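As a concrete reference point for the notes on pack, here is a minimal sketch of the whole-file variant: one table of self-terminating bit codes built from byte counts across the entire input. It is a textbook Huffman construction written for illustration, not the actual pack program, and the function names are assumptions made for this example. Bits are kept as a text string of "0" and "1" for readability rather than packed into bytes.

```python
import heapq
from collections import Counter


def huffman_table(data: bytes) -> dict[int, str]:
    """Build one prefix-free bit code per byte value from whole-file counts."""
    counts = Counter(data)
    if not counts:
        return {}
    if len(counts) == 1:                      # degenerate single-symbol file
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(data: bytes, table: dict[int, str]) -> str:
    """Concatenate the bit code of every byte."""
    return "".join(table[b] for b in data)


def decode(bits: str, table: dict[int, str]) -> bytes:
    """Read bits until a complete code is seen, emit its byte, repeat."""
    inverse = {code: sym for sym, code in table.items()}
    out, current = bytearray(), ""
    for bit in bits:
        current += bit
        if current in inverse:                # codes are self-terminating
            out.append(inverse[current])
            current = ""
    return bytes(out)
```

A real file format would also have to store the table ahead of the data, which is the prefixed table that the whole-file case requires.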

1)      The data structure of pack is a list of self-terminating variable-length bit codes, either recalculated at each step (as in a data stream) or calculated maximally across an entire file.

2)      The former uses differing codes at each step as each new character is encoded, and the rules for assigning them may vary.

3)      The latter will build a single table and prefix it to the data structure.

4)      Note that a fixed table for an entire file is only an average-case optimum, in that a different rule for populating the table might result in denser coding.

5)      The dynamic one will start with varying codes, be loose in its assignment of the codes, and may not achieve optimal compression until enough of a sample is built up.

6)      Different rules for reassigning the weights to the bit strings may work better or worse in either case (full file or stream), depending on how the data actually changes from packet to packet.  Having the entire file to analyze can result in finding optimal points to insert encoding changes: either according to a fixed rule, or by explicit dictionary edit.

LZW

LZW, like the earlier LZ, builds up a dictionary of strings as it goes; but it could also use a full-file analysis to prefix a dictionary to the file, or assume one for various contexts.

Varying rules for changing the entries in this table could also be applied for higher density, similar to Huffman encoding in pack.

When the data structure itself is examined, the most noticeable condition is that the individual elements have a very dense part (code) and a very sparse part (datum).

The result is that nearly half (8/20) of the bits in the compressed file are essentially not compressed at all.

As such, for 50% compression, about another 20% of the compressed size (50% of 8/20) should be available if the datum part were also compressed.  This would result in a 60% final size reduction: the 50% compressed file splits into 30% code and 20% datum, and halving the datum leaves 30% + 10% = 40% of the original.

Ordered list of
  Compression element of
    Structure of
      code
      datum
  which in itself is a dictionary of
    prefixed fixed dictionary and
    accretion growth list of
      compression element of
        structure of
          fixed assigned code
          fixed datum
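The code/datum element structure outlined above can be made concrete with a small sketch. It uses the LZ78 pair form, where each output element is a dictionary index (the code) plus one raw byte (the datum); LZW proper emits codes only, so this is an illustration of the structure being discussed rather than an LZW implementation. The 12-bit code width is an assumption chosen so that code plus datum totals the 20 bits implied by the 8/20 figure.

```python
def lz78_pairs(data: bytes) -> list[tuple[int, int]]:
    """Return (code, datum) pairs: index of the longest known prefix, next byte."""
    dictionary: dict[bytes, int] = {b"": 0}   # code 0 is the empty string
    pairs: list[tuple[int, int]] = []
    prefix = b""
    for byte in data:
        candidate = prefix + bytes([byte])
        if candidate in dictionary:
            prefix = candidate                # keep extending the match
        else:
            pairs.append((dictionary[prefix], byte))
            dictionary[candidate] = len(dictionary)
            prefix = b""
    if prefix:                                # flush a trailing match
        pairs.append((dictionary[prefix[:-1]], prefix[-1]))
    return pairs


def lz78_expand(pairs: list[tuple[int, int]]) -> bytes:
    """Rebuild the input by regrowing the same dictionary from the pairs."""
    entries = [b""]
    out = bytearray()
    for code, datum in pairs:
        entry = entries[code] + bytes([datum])
        entries.append(entry)
        out += entry
    return bytes(out)


# Bit accounting under the assumed widths (12 + 8 = 20 bits per element).
CODE_BITS, DATUM_BITS = 12, 8
datum_share = DATUM_BITS / (CODE_BITS + DATUM_BITS)    # share left uncompressed
compressed = 0.50                                      # 50% compression
final = compressed * (1 - 0.5 * datum_share)           # datum part halved
```

Under these assumptions `datum_share` is 0.4 and `final` is 0.4 of the original size, which matches the 60% reduction estimated in the text.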

5.2      New methods

5.2.1      Code/Programs
  Archivers
    DeStufdt: Data extended system transform unified file/directory trees
    Mafs: Media archive file systems
    Vaptocs: volumes and partitions table of contents systems
  Representation

Most programs have the option of presenting the data as seen or keeping it in an internal format.

Changing back and forth between them should essentially be a completely transparent operation.

This can be done, as in the “dotz” and “MVFS” NFS-compliant file systems, by tying a front end program to the file operations in order to interpret the sequences involved.

This introduces new errors (“cannot convert properly” and similar), but has the side effect of allowing canonical representations to be maintained with minimal cost overhead.
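A library-level sketch of the same idea follows: a front end that inspects the stored form on open and hands back the presented form. The "dotz" and "MVFS" systems mentioned above do this inside the file system; this example only imitates the behavior in user code. `open_as_seen` and `ConversionError` are hypothetical names, and only the gzip and bzip2 leading bytes are recognized.

```python
import bz2
import gzip
import io
import zlib

# Leading "magic number" bytes mapped to a decoder. Only these two real
# formats are handled; anything else is passed through unchanged.
DECODERS = {
    b"\x1f\x8b": gzip.decompress,
    b"BZh": bz2.decompress,
}


class ConversionError(Exception):
    """The 'cannot convert properly' case introduced by the front end."""


def open_as_seen(path: str) -> io.BytesIO:
    """Open a file and return its presented form, whatever its stored form."""
    with open(path, "rb") as handle:
        raw = handle.read()
    for magic, decode in DECODERS.items():
        if raw.startswith(magic):
            try:
                return io.BytesIO(decode(raw))
            except (OSError, EOFError, ValueError, zlib.error) as exc:
                raise ConversionError(f"cannot convert {path}: {exc}") from exc
    return io.BytesIO(raw)
```

The caller reads from the returned object the same way in every case, so the stored representation can stay canonical (compressed) at the cost of one new failure mode.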

What is needed is a systematic way to state these variant representations, and have a single interface of common semantics to implement this.

File systems
  Typed files
    Extensions
    Magic numbers
    Full data type systems
  Symbolic links

A symbolic link by default points to the file to link to by either a relative or absolute path.

What could be put there instead is an arbitrary text program that provides the converter on open for a specific file, even one that distinguishes which program is opening the file.

CDFs

Context dependent files change what they look like or even what they contain on the basis of external data, such as variables in the process environment or what type of system the file is on.

Programs
  Yar: yet another representor

Or Roy’s ridiculously recursive reformatting representor (r4).

Compressors
  Tuck/untuck
  Holographic
  Extensible lazy evaluation

5.2.2      Data/Structure
  Archivers
  Representation
  Compressors
    Tuck/untuck
    Holey (Holographic object lazy evaluation ylem(inator))

Scattering the information uniformly across every bit is both an effective cryptographic technique and an effective compression technique.

All ya gotta do is pick a good evaluator.

Using the input data as a transform to find the coefficients of a polynomial in N space, and then rebuilding the input data from those coefficients, is one method.  There are others.

Extensible lazy evaluation

The basic data structure is one I call redictuless:

Recursively extensible data inference compression tree using lazy evaluation symbol strings.

The basic idea is that the compression codes themselves recursively refer to other codes in the list.

If the table is built up by scanning the entire file, then the codes themselves may recursively refer to themselves (if careful construction is used to provide a terminator for such).

If done by stream oriented scanning, the codes would use the ones already found to add new ones to the list.
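A minimal sketch of the stream-oriented case, under the assumption that a code is stored as a list whose items are either literal bytes or the index of another code. Because a new code may only cite codes already in the table, expansion always terminates without a separate assessor. The names `expand` and `add_code` are hypothetical, chosen for this example.

```python
from typing import Iterator, Union

Symbol = Union[bytes, int]        # literal bytes, or the index of another code


def expand(code: int, table: list[list[Symbol]]) -> Iterator[bytes]:
    """Lazily yield the literal pieces that a code stands for."""
    for symbol in table[code]:
        if isinstance(symbol, int):
            yield from expand(symbol, table)   # follow the reference on demand
        else:
            yield symbol


def add_code(table: list[list[Symbol]], symbols: list[Symbol]) -> int:
    """Stream-style growth: a new code may only cite codes already present."""
    for symbol in symbols:
        if isinstance(symbol, int) and not 0 <= symbol < len(table):
            raise ValueError("reference to a code that does not exist yet")
    table.append(symbols)
    return len(table) - 1


# Usage: each new code builds on the ones found before it.
table: list[list[Symbol]] = []
the = add_code(table, [b"the "])
cat = add_code(table, [the, b"cat "])
line = add_code(table, [cat, b"sat on ", the, b"mat"])
text = b"".join(expand(line, table))
```

Nothing is expanded until `expand` is iterated, so a reader that only needs the first few pieces never pays for the rest.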

If all elements of each code have approximately the same data density, then the index of codes will be at maximum compression levels in all parts of the file or data stream.

Ordered list of
  Compression element of
    Union of
      Method/s, which are

Termination assessor

If a recursive encoding is allowed, then there must be a method of terminating the recursion.

Parameter

The top level codes can be used to hold a parameter block for the assessor if desired.

Level count

A fixed number (do not recurse more than three times) can be used.

Or a variable one.

DETECT: Data evaluation Test Extended Context Termination

A fully functional routine may be used as a standard item, with additional ones added in.

Method

The method may be specified by an explicit fixed code, or by an extensible startup or runtime extended list of such assessors.

Parameters

The parameters may be null, at the level of the assessor, or picked from the surrounding context.

Safeties

With any dynamic recursive test for end of expansion (as in any macro system: cpp, m4, or the lambda calculus, for example), some form of safety must be invoked to stop runaways on improper execution or structure encoding.
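One possible shape for such a safety, sketched under the assumption that the table maps a code to a list of literal bytes and integer references, and that references may be cyclic. It combines a fixed level count with an output budget; `expand_guarded`, `RunawayExpansion`, and both limits are hypothetical choices made for this example.

```python
class RunawayExpansion(Exception):
    """Raised when expansion exceeds its level count or output budget."""


def expand_guarded(code, table, max_depth=3, max_bytes=1 << 20):
    """Expand a code, refusing to recurse or grow past the stated limits.

    `table` maps a code to a list whose items are bytes (literal data) or
    ints (references to other codes, possibly cyclic).
    """
    out = bytearray()

    def walk(current, depth):
        if depth > max_depth:
            raise RunawayExpansion(
                f"code {current} nested deeper than {max_depth} levels"
            )
        for symbol in table[current]:
            if isinstance(symbol, int):
                walk(symbol, depth + 1)       # one more level of recursion
            else:
                out.extend(symbol)
                if len(out) > max_bytes:
                    raise RunawayExpansion("output budget exceeded")

    walk(code, 0)
    return bytes(out)
```

A well-formed table expands normally, while a code that refers to itself is cut off once the level count is exceeded instead of running forever.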

If the codes are built up as simple macros with other code references, then expansion may be terminated via macros that cease to expand, and no separate assessor may be needed; although parameters would still need to be supplied (fixed, stated, context implied, or explicitly stated).

(optional, implied) code/assessor
(optional, implied) code/parameter
Code/data, which is one of
  Structure of code code
  List of Code
  Etc.


6         Summary

7         Closing

A carefully considered method of systematizing these ad hoc usages would yield tremendous gains.




