Question:
Embedding all the external resources of an HTML page into a single file using JavaScript in the browser
Background:
As you all know, external resources, like images, can be embedded into the html file using base64 encoding:
<img src="..." />
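For illustration, the value of such a src attribute can be built from a resource's raw bytes as shown below. This is a minimal sketch, not part of the original question: the helper name bytesToDataUri is hypothetical, and it assumes a global btoa, which browsers provide (recent Node.js versions do too).

```javascript
// Hypothetical helper: encode raw bytes as a base64 data URI,
// suitable for use as the src of an <img> element.
function bytesToDataUri(bytes, mimeType) {
  // btoa expects a "binary string" with one character per byte.
  let binary = '';
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return `data:${mimeType};base64,${btoa(binary)}`;
}

// The two bytes of the ASCII text "Hi".
const uri = bytesToDataUri(new Uint8Array([72, 105]), 'text/plain');
console.log(uri); // data:text/plain;base64,SGk=
```

For very large resources, building the string one character at a time may be slow; in a browser, FileReader.readAsDataURL on a Blob is a common alternative.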
I'm looking for a pure browser-based JavaScript way to traverse an HTML page and embed all of its external resources into the document, so that calling $("html").html() returns the page's entire contents, external resources included.
For context, I'm trying to download web pages as single files using a headless browser on my server.
Solution:
There are tools out there that do this. Examples:
- https://github.com/remy/inliner
- https://github.com/jgallen23/grunt-inline-css
- https://github.com/ceee/grunt-datauri
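For the in-browser route the question asks about, the general idea can be sketched as follows. This is a rough illustration under stated assumptions, not how any of the tools above is implemented: the function names inlineImages and bytesToDataUri are hypothetical, only img elements are handled, and the page is assumed to be allowed to fetch each image (cross-origin requests are subject to CORS).

```javascript
// Hypothetical helper: encode raw bytes as a base64 data URI.
function bytesToDataUri(bytes, mimeType) {
  let binary = '';
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return `data:${mimeType};base64,${btoa(binary)}`;
}

// Hypothetical sketch: replace the src of every <img> under `root`
// with a data URI. `root` is anything that has querySelectorAll
// (for example `document`); `fetchImpl` defaults to the global fetch.
async function inlineImages(root, fetchImpl = fetch) {
  const images = Array.from(root.querySelectorAll('img[src]'));
  await Promise.all(images.map(async (img) => {
    const src = img.getAttribute('src');
    if (src.startsWith('data:')) return; // already inlined

    // In a browser, img.src is the resolved absolute URL.
    const response = await fetchImpl(img.src || src);
    if (!response.ok) return; // leave the original reference in place

    // Drop parameters such as "; charset=utf-8" from the content type.
    const contentType =
      response.headers.get('content-type') || 'application/octet-stream';
    const mimeType = contentType.split(';')[0].trim();

    const bytes = new Uint8Array(await response.arrayBuffer());
    img.setAttribute('src', bytesToDataUri(bytes, mimeType));
  }));
}
```

In a page, this might be used as await inlineImages(document), after which $("html").html() would include the embedded images. Stylesheets, scripts, fonts, and url(...) references inside CSS would each need similar handling.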
While there are benefits to this approach, keep the trade-off in mind: a page visited more than once, or a site whose pages share the same JS/CSS files, normally benefits from client-side (browser) caching, and inlined resources cannot be cached separately from the page.