Some words about LZ77

It somewhat amuses me how people discover they can use deflate to estimate text complexity or similarity. The reason for that is that this compression algorithm was essentially a by-product of efforts to estimate the string complexity. The original paper states it outright that the compression algorithm is “an adaptation of a simple copying procedure discussed recently” (referring to A. Lempel and J. Ziv, “On the complexity of finite sequences,” IEEE Trans. Inform. Theory, vol. IT-22, pp. 75-81, Jan. 1976). Apparently it went the full circle, proving that their original approach to complexity was right and that the authors (who did more work in that field than if field of compression) should be celebrated for this subject as well.

Leave a Reply