cutSpace
Cherry Blossoms

Introduction

I became interested in obfuscation after reading the article 「Obfuscation of the code prohibits take-out!」, and when I looked into it, it turned out to be used quite widely.
So I decided to take on a language-processing tool as a showcase of my skills, because the quality of the parser determines how clean the result is.

Creating a tool

I started development by splitting the work into the modules listed below,
but because there are many cases to handle and separate passes are error-prone, I merged (2-3) and processed them in a single pass.
(When I compressed C code with AWK for 「√2 - program contest」, this was not a problem because of the restrictions ^^);

  1. Join lines : \ (backslash) at end of line
  2. String parsing : "String", 'String', `back quote` (template literal), /regex/ (needs investigation, since it conflicts with division). Of course, the parts of HTML outside the tags have no concept of strings.
  3. Delete comments : <!-- -->, /* */, // line comment
  4. Splitting and joining lines (as a final task, I refactored the code and merged (2-4))
  5. Whitespace compression : HTML (whitespace inside tags is also compressed), Script, CSS (the semicolon immediately before '}' is removed)
    (What it does is simple, but alongside (2-4) it is the largest module, driven only by regular expressions and parameters)
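
Why merging (2-3) helps can be seen in a small sketch. The Python below is not cutSpace's code (the tool itself is Java); it is a minimal illustration, and the function name strip_comments is hypothetical. It walks script text once, copies string literals verbatim, and drops /* */ and // comments only when they appear outside a string. It assumes the input is script text only, and it does not handle /regex/ literals, ${...} nesting in template literals, or HTML <!-- --> comments.

```python
def strip_comments(src: str) -> str:
    """Remove /* */ and // comments in one pass, leaving string literals untouched.

    Illustrative sketch only: regex literals and nested template
    expressions are NOT recognized.
    """
    out = []
    i, n = 0, len(src)
    while i < n:
        ch = src[i]
        if ch in "\"'`":
            # Copy the whole literal, skipping over backslash escapes.
            j = i + 1
            while j < n and src[j] != ch:
                j += 2 if src[j] == "\\" else 1
            j = min(j + 1, n)          # include the closing quote if present
            out.append(src[i:j])
            i = j
        elif src.startswith("/*", i):
            # Block comment: skip to just past the closing */.
            end = src.find("*/", i + 2)
            i = n if end < 0 else end + 2
        elif src.startswith("//", i):
            # Line comment: skip to the end of line, keeping the newline.
            end = src.find("\n", i)
            i = n if end < 0 else end
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(strip_comments('a = "http://x"; // note\nb = 1; /* c */'))
```

Because strings are consumed before comments are looked for, the // inside "http://x" is never mistaken for a comment, which is the kind of interaction that makes two separate passes fragile.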

Working with three languages was difficult regardless of appearance, but when you look at it after completion, it is exhilarating and nice.

The deliverables are distributed as open source. (Ideal for code reading: 7 modules, 1k steps (lines of code).)

Install

  1. Download Java from 「Java Downloads」. (The .zip distribution is recommended because it does not pollute the environment, and multiple Java versions can be installed side by side.)
  2. Download cutSpace from 「cutSpace - HTML圧縮ツール (Compression tool)」. (The command files are included.)
  3. Set the JAVAHOME variable in the makefile in the cutSpace folder to your Java home path.

Execution

If you are a Windows user, please choose one of the following. (By the way, the files are UTF-8 with \n line endings.)
  1. Open the cutSpace folder in Explorer, type cmd in the address bar (breadcrumbs) at the top to open a command prompt there, and type the command at the prompt.
  2. Create a shortcut to make.cmd and append the target and other arguments to the "Target" field in the shortcut's properties.
  3. Create a batch file.

Summary

Notes on guarding HTML.
  1. Disable saving images
    To disable "Save image" and drag and drop, set "pointer-events: none;" in the CSS for the image tag.
  2. Disable source display
    To disable the right-click menu ("View Page Source", "Screenshot"), add 「style="margin: 0;"」 and 「oncontextmenu="return false;"」 to the <body> tag.
    ※ However, if your blog uses a template, you cannot edit <body>. It is also not recommended because it makes the page harder to use.
  3. The OS screenshot function cannot be blocked either way.
  4. In Chrome, Edge, Firefox, etc., typing "view-source:" in front of the URL displays the source code (Menu -> Other Tools -> View Page Source, or a keyboard shortcut, also works).
  5. The purpose of HTML obfuscation (whitespace / comment compression) is not so much obfuscation as visibility and a speed-up through weight reduction.
  6. Obfuscation of local variable and function names in Script.
  7. Encryption of string constants in Script.

I examined 「THE WHITE HOUSE」 as a reference website.
It restricts downloads of specific images (1) and compresses its scripts (5).
(The code is complicated for the sake of appearance and makes heavy use of SVG. It was also the first time I had seen a line break inside a string in an SVG tag ^^);
It is also famous for the HTML comment that carries a job listing. (10th line)

In closing

During development, I came up with the brute-force trick of removing the whitespace around a "string" inside a tag.
<tag foo="" bar=""> → <tag foo=""bar="">
Browsers handle this fine, but the W3C checker reports it as an error (not a warning), so I gave up on it as standard behavior. I did implement it as an option, though. (Option -x)
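
The whitespace-removal trick above can be sketched in a few lines. This Python is only an illustration of the idea, not the code behind option -x; the function name squeeze_tag is hypothetical, and it assumes its input is a single, complete tag. Quoted attribute values are copied verbatim, and any whitespace directly after a closing quote is dropped.

```python
def squeeze_tag(tag: str) -> str:
    """Drop whitespace that follows a quoted attribute value inside one tag.

    Illustrative sketch only: the input is assumed to be a single tag
    such as '<tag foo="" bar="">'.
    """
    out = []
    i, n = 0, len(tag)
    while i < n:
        ch = tag[i]
        if ch in "\"'":
            # Copy the quoted value unchanged, including both quotes.
            end = tag.find(ch, i + 1)
            end = n if end < 0 else end + 1
            out.append(tag[i:end])
            i = end
            # Skip the whitespace that follows the closing quote.
            while i < n and tag[i].isspace():
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(squeeze_tag('<tag foo="" bar="">'))   # <tag foo=""bar="">
```

Whitespace inside the quotes is never touched, so a value such as alt="a b" keeps its space. The result is what the W3C checker rejects, which is why this kind of squeezing is best kept behind an option.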

If you like the tool, please put ☆ in the comment title field. (△ Nickname, text, password)

Bonus

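## Usage: gawk -f cutSpace.awk input.html > output.html
BEGIN { IGNORECASE = 1 }            # gawk extension: match <PRE> as well as <pre>
{
    if (!pre)
        sub(/^[ \t]+/, "")          # strip leading whitespace, except inside <pre>
    sub(/[ \t]+$/, "")              # strip trailing whitespace everywhere
    if (match($0, "</pre>")) pre = 0
    nl = pre ? "<br>" : " "         # inside <pre> keep the line break as <br>; elsewhere join with a space
    if (match($0, "<pre")) pre = 1
    if ("" != $0) {                 # drop blank lines
        printf "%s%s", $0, nl
        print $0 > "/dev/stderr"    # echo each kept line to stderr for checking
    }
}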
※ Because lines are joined with a space, JavaScript that omits ';' is not supported.
2022.4.22