Loading static files from disk

Authorizing and Mapping Urls and Domains

By default, PageSpeed loads sub-resources via an HTTP fetch. Loading them directly from the filesystem would be faster, but it may not be safe: the sub-resources may be dynamically generated, or they may not be stored on the same server.

However, you can explicitly tell PageSpeed to load static sub-resources from disk by using the LoadFromFile directive. For example:

pagespeed LoadFromFile "http://www.example.com/static/" "c:\www\static/"

tells PageSpeed to load all resources whose URLs start with http://www.example.com/static/ from the filesystem under c:\www\static/. For example, http://www.example.com/static/images/foo.png will be loaded from the file c:\www\static/images/foo.png. However, http://www.example.com/bar.jpg will still be fetched using HTTP.
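The prefix mapping described above can be sketched in a few lines of Python. This is only an illustration of the URL-to-path substitution, not PageSpeed's actual implementation; the function name map_url_to_file is hypothetical.

```python
from typing import Optional


def map_url_to_file(url: str, url_prefix: str, fs_prefix: str) -> Optional[str]:
    """Return the mapped filesystem path, or None if the URL is not covered.

    Hypothetical helper: it only models the prefix substitution that a
    LoadFromFile mapping describes.
    """
    if not url.startswith(url_prefix):
        return None  # outside the mapping, so the resource is fetched over HTTP
    # Swap the URL prefix for the filesystem prefix and keep the rest as-is.
    return fs_prefix + url[len(url_prefix):]


URL_PREFIX = "http://www.example.com/static/"
FS_PREFIX = "c:\\www\\static/"

print(map_url_to_file("http://www.example.com/static/images/foo.png", URL_PREFIX, FS_PREFIX))
print(map_url_to_file("http://www.example.com/bar.jpg", URL_PREFIX, FS_PREFIX))
```

The first call yields the path under c:\www\static/, and the second returns None because the URL does not start with the mapped prefix.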

If you need more sophisticated prefix-matching behavior, you can use the LoadFromFileMatch directive, which supports RE2-formatted regular expressions. (Note that this is not the same format as the wildcards used above and elsewhere in PageSpeed.) For example:

pagespeed LoadFromFileMatch "^https?://example.com/~([^/]*)/static/" "c:\www\static/\\1"

This will load http://example.com/~pat/static/cat.jpg from c:\www\static/pat/cat.jpg, http://example.com/~sam/static/images/dog.jpg from c:\www\static/sam/images/dog.jpg, and https://example.com/~al/static/css/ie from c:\www\static/al/css/ie. The resource http://example.com/~pat/images/static/puppy.gif, however, would not be matched by this directive and would be fetched using HTTP.
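To see which URLs the pattern above matches and what the capture group holds, you can try it in Python. Note the assumptions: Python's re module stands in for RE2 here (the two agree on this simple pattern), and the helper match_user_static is hypothetical. The sketch only shows the match, the captured user name, and the remaining path; it does not model how PageSpeed assembles the final filename.

```python
import re
from typing import Optional, Tuple

# Same pattern as in the directive above; Python's `re` is used as a stand-in for RE2.
URL_PATTERN = re.compile(r"^https?://example.com/~([^/]*)/static/")


def match_user_static(url: str) -> Optional[Tuple[str, str]]:
    """Return (captured user, rest of the URL after the match), or None if no match."""
    m = URL_PATTERN.match(url)
    if m is None:
        return None  # not covered by the mapping, so fetched over HTTP
    return m.group(1), url[m.end():]


for url in (
    "http://example.com/~pat/static/cat.jpg",
    "http://example.com/~sam/static/images/dog.jpg",
    "https://example.com/~al/static/css/ie",
    "http://example.com/~pat/images/static/puppy.gif",
):
    print(url, "->", match_user_static(url))
```

The last URL does not match because "/static/" does not directly follow the "~pat" path segment.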

Because PageSpeed is loading the files directly from the filesystem, no custom headers will be set.

You can also use the LoadFromFile directive to load HTTPS resources that would not otherwise be directly fetchable. For example:

pagespeed LoadFromFile "https://www.example.com/static/" "c:\www\static/";

The filesystem path must be an absolute path.

You can specify multiple LoadFromFile associations in configuration files. Note that large numbers of such directives may impact performance.

If the sub-resource cannot be loaded from file in the directory specified, the sub-request will fail (rather than fall back to HTTP fetch). Part of the reason for this is to indicate a configuration error more clearly.

As an added benefit, if resources are loaded from file, the rewritten versions will be updated immediately when you change the associated file. Resources loaded via normal HTTP fetches are refreshed only when they expire from the cache (by default every 5 minutes), so their rewritten versions are updated only as often as the cache is refreshed. Resources loaded from file are not subject to caching behavior because they are accessed directly from the filesystem for every request for the rewritten version.

See also MapOriginDomain.

This directive cannot be used in location-specific configuration sections.

Limiting Direct Loading

A mapping set up with LoadFromFile allows filesystem loading for anything it matches. If you have directories or file types that cannot be loaded directly from the filesystem, LoadFromFileRule lets you add fine-grained rules to control which files will be loaded directly and which will fall back to the standard process, over HTTP.

When given a URL, PageSpeed first determines whether any LoadFromFile mappings apply. If one does, it calculates the mapped filename and checks for applicable LoadFromFileRules. Considering rules in the reverse order of definition, it takes the first applicable one and uses that to determine whether to load from file or fall back to HTTP.

Some examples may be helpful. Consider a website that is entirely static content except for a /bin directory:

c:\www\index.html
c:\www\css\style.css
c:\www\gfx\image.png
c:\www\bin\webapp.dll

While most of the site can be loaded directly from the filesystem, webapp.dll and web.config are files that need to be interpreted before serving -- or not served at all! Adding a rule disallowing the /bin directory tells PageSpeed to fall back to HTTP appropriately:

pagespeed LoadFromFile http://example.com/ c:\www\
pagespeed LoadFromFileRule Disallow c:\www\bin

The LoadFromFileRule directive takes two arguments. The first must be either Allow or Disallow while the second is a prefix that specifies which filesystem paths it should apply to. Because the default is to allow loading from the filesystem for all paths listed in any LoadFromFile statement, most of the time you will be using Disallow to turn off filesystem loading for some subset of those paths. You would use Allow only after a Disallow that was overly general.
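The evaluation order for prefix rules can be sketched as follows. This is a simplified model of the behavior described above, not PageSpeed's code: the function should_load_from_file is hypothetical, and the c:\www\bin\public path in the Allow rule is an invented example of relaxing an overly general Disallow.

```python
from typing import List, Tuple

# Each rule is (allow, filesystem path prefix), listed in order of definition.
Rule = Tuple[bool, str]


def should_load_from_file(filename: str, rules: List[Rule]) -> bool:
    """Decide whether an already-mapped filename may be loaded from disk."""
    # Rules are considered in reverse order of definition;
    # the first one whose prefix applies decides the outcome.
    for allow, prefix in reversed(rules):
        if filename.startswith(prefix):
            return allow
    # No rule applies: paths covered by a LoadFromFile mapping are allowed by default.
    return True


rules: List[Rule] = [
    (False, "c:\\www\\bin"),         # Disallow c:\www\bin
    (True, "c:\\www\\bin\\public"),  # hypothetical Allow carving out a subdirectory
]

for path in (
    "c:\\www\\css\\style.css",
    "c:\\www\\bin\\webapp.dll",
    "c:\\www\\bin\\public\\logo.png",
):
    print(path, "->", "file" if should_load_from_file(path, rules) else "HTTP")
```

Because the Allow rule is defined after the Disallow rule, it is checked first and wins for anything under the carved-out subdirectory.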

Not all sites are well suited for prefix-based control. Consider a site with aspx files mixed in with ordinary static files:

c:\www\index.html
c:\www\webmail.aspx
c:\www\webmail.css
c:\www\blog/index.aspx
c:\www\blog/header.png
c:\www\blog/blog.css

Blacklisting just the .aspx files so they fall back to an HTTP fetch allows everything else to be loaded directly from the filesystem:

pagespeed LoadFromFile http://example.com/ c:\www\;
pagespeed LoadFromFileRuleMatch Disallow \.aspx$;

The LoadFromFileRuleMatch directive also takes two arguments. The first is either Allow or Disallow and functions just like for LoadFromFileRule above. The second argument, however, is an RE2-format regular expression instead of a file prefix. Remember to escape characters that have special meaning in regular expressions. For example, if instead of \.aspx$ we had simply .aspx$ then a file named example.notaspx would still be forced to load over HTTP because "." is special syntax for "match any single character".
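The effect of forgetting to escape the dot is easy to check. The sketch below uses Python's re module as a stand-in for RE2 and assumes the pattern may match anywhere in the filename (an unanchored search); the filename example.notaspx is made up for the demonstration.

```python
import re

escaped = re.compile(r"\.aspx$")   # literal dot followed by "aspx" at the end
unescaped = re.compile(r".aspx$")  # any single character followed by "aspx" at the end

for name in ("blog/index.aspx", "example.notaspx", "blog/blog.css"):
    print(
        name,
        "| escaped matches:", escaped.search(name) is not None,
        "| unescaped matches:", unescaped.search(name) is not None,
    )
```

Only the unescaped pattern matches example.notaspx, since its "." is satisfied by the letter "t".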

Consider a site with the opposite problem: a few file types can be reliably loaded from file but the rest need interpretation first. For example:

c:\www\index.html
c:\www\site.css
c:\www\script-using-ssi.js
c:\www\generate-image.ashx
c:\www\

In this site generate-image.ashx needs to be interpreted to make images. The only resources on the site that are generally safe to load are .css ones. By first blacklisting everything and then whitelisting only the .css files, we can make PageSpeed do this:

pagespeed LoadFromFile http://example.com/ c:\www\
pagespeed LoadFromFileRuleMatch disallow .*
pagespeed LoadFromFileRuleMatch allow \.css$

This works because order is significant: later rules take precedence over earlier ones.
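A small simulation makes the ordering effect concrete. As before, this is a sketch under assumptions rather than PageSpeed's implementation: Python's re replaces RE2, matching is treated as an unanchored search, and the helper rule_match_allows is hypothetical.

```python
import re
from typing import List, Tuple

# Each rule is (allow, regular expression), listed in order of definition.
MatchRule = Tuple[bool, str]


def rule_match_allows(filename: str, rules: List[MatchRule]) -> bool:
    """Apply regex rules with later definitions taking precedence."""
    for allow, pattern in reversed(rules):
        if re.search(pattern, filename):
            return allow
    return True  # default when no rule applies: load from file


as_configured: List[MatchRule] = [(False, r".*"), (True, r"\.css$")]
swapped: List[MatchRule] = [(True, r"\.css$"), (False, r".*")]

for path in ("c:\\www\\site.css", "c:\\www\\generate-image.ashx"):
    print(
        path,
        "| as configured:", rule_match_allows(path, as_configured),
        "| swapped:", rule_match_allows(path, swapped),
    )
```

With the rules swapped, the catch-all disallow is defined last and therefore overrides the .css allow, so nothing would be loaded from file.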