[Hpr] Static Site Generators - NOT a flat file CMS

Ken Fallon ken at fallon.ie
Sat May 11 06:07:09 PDT 2019


Hi Carl,

Just press record and read this as is. This is an episode in itself, and
that is more likely to solicit feedback, which you can cover in your
third show. (Your second being "how I got into tech" by way of
introduction.) (See what I did there :) )

Great idea by the way, and I would like to have a zip file of the
directory structure to try it out.

Thanks.

-- 
Regards,

Ken Fallon
http://kenfallon.com
http://hackerpublicradio.org/correspondents.php?hostid=30

On 2019-05-10 04:59, Carl Chave wrote:
> Hello HPR Community!  I'm Carl, but I've been using the name "sodface"
> online since about 1996.  I've been a computer hobbyist and an IT
> worker for many years but I must admit that I'd become a bit
> disinterested on the hobby side over the last few years but that
> recently turned around when I decided to put Fedora on all my home
> machines - it's really lit the fire again and I'm pretty happy about
> that.  I've been listening to HPR and other linux podcasts and I plan
> on recording and submitting an episode for HPR soon.  Saying that
> before I do it is probably a jinx but I sort of wanted to reply to
> this post because it's relevant to the episode I had planned and I
> figured this email might also serve as my episode outline.
> Additionally, maybe I'll get some feedback that will help shape the
> episode or perhaps convince me not to bother with it at all!
> 
> But to the point:
> 
> I had an idea sometime around 2007 or so when I was a moderator and
> "webmaster" of an online hardware forum to make my own template driven
> site generator.  At that time I was using a typical LAMP stack and did
> my first proof of concept work with those tools.  Just recently, I
> discovered the althttpd web server, which is a 44k compiled binary
> built from a single file of c code written by D. Richard Hipp, author
> of Fossil and SQLite and much more.  He actually uses it to serve the
> Fossil and SQLite websites so far as I can tell, so it's been proved
> in a fairly high traffic context.  I started experimenting with it and
> it made me revisit my template idea again and reshape it to
> incorporate some of the features that althttpd brings to the table.
> 
> Yawn, I know, the last thing the world needs is another website
> generator / framework / CMS etc. but I don't really consider my
> implementation any of those, it's really more just an idea and an
> early stage proof of concept.  I'm calling it "Symplate", for symbolic
> linked templates.  To give it a name feels a bit like I think it's
> more than it is but it's easier and more fun to call it something, and
> maybe giving it a name will help it to become something after all
> these years.
> 
> My design goals were roughly:
> -- make it simple but flexible
> -- make it feel like Unix / Linux
> -- use familiar basic Linux utils
> -- don't introduce anything new, like special syntax
> -- stick to writing in plain html for the first iterations
> -- make it portable, I'm trying to keep it compatible with the BusyBox
> implementations of the core Linux utils
> -- try to make it self documenting
> 
> The "features" are:
> -- global templates that are sym-linked to so changes to the template
> are then reflected to all pages that link to them
> -- the ability to insert additional templates into the normal flow of
> page generation on a per page basis
> -- the ability to override global templates as needed on a per page basis
> -- Static or dynamic pages with shell variable and command
> substitution in the templates
> -- Virtual hosts for hosting multiple domains under a single server instance
> 
> The implementation so far:
> -- althttpd as a web server gives us virtual host directories, static
> html and cgi script, and protected content directories in the public
> web root.  I'm using all these features in Symplate.
> -- And althttpd can just be run on your local machine so you can work
> locally and then tar up the directory and put it on the web server.
> This might also work for HPR archival purposes?
> -- A very basic page directory structure on the server looks like this:
> [sodface at sodbook home]$ tree -l
> .
> ├── index.cgi -> ./-lc/thm/src.sh
> └── -lc
>     ├── 015.tpl
>     ├── 075.tpl
>     ├── page.src
>     └── thm -> ../../-ln/thm/@thm
>         ├── res
>         │   ├── css
>         │   │   ├── modern-normalize.css
>         │   │   └── symplate.css
>         │   └── img
>         ├── src.sh
>         ├── sym.thm
>         └── tpl
>             ├── 000.tpl
>             ├── 010.tpl
>             ├── 020.tpl
>             ├── 030.tpl
>             ├── 040.tpl
>             ├── 050.tpl
>             ├── 060.tpl
>             ├── 070.tpl
>             ├── 080.tpl
>             ├── 090.tpl
>             ├── 100.tpl
>             └── 110.tpl
> 
> This is what a home page might look like.  althttpd has a simple file
> search logic it uses when trying to find a page to serve.  If the
> requested url is say, www.sodface.com/home/ (with a trailing slash and
> no actual filename specified) althttpd will try to find a file to
> serve by appending /home, index.html, and index.cgi in that order.
> Some things to note about that:
> -- It lets us construct nice looking urls.  As long as you are
> consistent with a trailing slash, the matched index.html or index.cgi
> is never displayed in the user's browser.  If you leave off the
> trailing slash, a redirect is generated and the index.html or
> index.cgi shows up in the user's browser address bar.
> -- CGI scripts don't have to be called index.cgi.  That's just what
> althttpd looks for automatically.  Any filename with the +x bit set is
> treated as a CGI script by althttpd so it could just as easily be
> www.sodface.com/home/a-nice-script-name-here
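
The lookup order described above can be sketched in a few lines of shell. This is only an illustration of the order given in the email (index.html before index.cgi), not althttpd's actual code; the helper name resolve_index and the throwaway directory are assumptions made for the example, and the first candidate (the requested path itself) is left out.

```shell
# Sketch only: mimics the index lookup order described in the email.
# resolve_index is a hypothetical helper, not part of althttpd.
resolve_index() {
    dir=$1
    for candidate in index.html index.cgi; do
        if [ -f "$dir/$candidate" ]; then
            printf '%s\n' "$dir/$candidate"
            return 0
        fi
    done
    return 1
}

# Demo in a throwaway directory standing in for a page directory.
demo=$(mktemp -d)
mkdir -p "$demo/home"
printf '#!/bin/sh\necho dynamic\n' > "$demo/home/index.cgi"
chmod +x "$demo/home/index.cgi"

first=$(resolve_index "$demo/home")     # only index.cgi exists so far
touch "$demo/home/index.html"
second=$(resolve_index "$demo/home")    # index.html now takes precedence

echo "$first"
echo "$second"
```

The second lookup returning index.html is the behaviour the static-page option later in the email relies on.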
> 
> In the above page example I'm using index.cgi to construct the home
> page.  You can see that the index.cgi file is really a sym-link to
> ./-lc/thm/src.sh  Some notes about that:
> -- A directory prefixed with a dash "-" is not directly accessible via
> a browser url so those directories can be out in the otherwise public
> web root.  For Symplate I decided that each web page would be a
> directory and would contain a protected sub-directory named "-lc" for
> local content.
> -- The "./-lc/thm" directory is actually another sym-link back to a
> global protected directory named "-ln" (after the command for creating
> links, as in, "this is the directory that most things link back to").
> So the actual CGI shell script is located starting from the document
> root at ./-ln/thm/@thm/src.sh
> -- An aside, I'm using mostly three character directory names just to
> be terse and sort of Linux like.  It helps to keep directory listings
> cleaner while still making sense, thm is short for "theme", @thm is
> the active theme, res short for "resource" as in images, audio files
> etc, tpl short for "template" etc.  There are some deviations like I
> thought page.src was more self explanatory than pge.src.
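
For anyone wanting to try the layout, the following is a minimal sketch that recreates the sym-link structure from the tree listing in a temporary directory. The placeholder src.sh (which only prints its working directory) is an assumption for illustration and not the real Symplate script; the directory and link names are taken from the listing above.

```shell
# Sketch only: rebuilds the directory/sym-link layout in a temp directory.
root=$(mktemp -d)                        # stands in for the document root

# Global, protected theme directory holding the shared script and templates.
mkdir -p "$root/-ln/thm/@thm/tpl" "$root/-ln/thm/@thm/res/css"
printf '#!/bin/sh\npwd -P\n' > "$root/-ln/thm/@thm/src.sh"   # placeholder script
chmod +x "$root/-ln/thm/@thm/src.sh"

# One page: a directory with a protected "-lc" sub-directory.
mkdir -p "$root/home/-lc"
ln -s ../../-ln/thm/@thm "$root/home/-lc/thm"
ln -s ./-lc/thm/src.sh "$root/home/index.cgi"

# Run the linked script from the page directory, as a CGI request would.
page_dir=$(cd "$root/home" && pwd -P)
seen_dir=$(cd "$root/home" && sh ./index.cgi)
echo "$seen_dir"
```

The placeholder reports the page directory rather than the theme directory, which is the property that lets one shared script serve every page.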
> 
> One of the nice things about this structure is that the linked src.sh
> shell script is the same script used by all pages that link to it but
> the script when running is working from the directory that the link is
> in, not the target, local to the requested page in other words.  This
> is handy when constructing the precedence of the templates, which I
> will discuss next:
> -- index.cgi is executed by althttpd which then waits for the output
> to be returned from the script
> -- index.cgi is sym-linked back to src.sh
> -- src.sh is a very basic script, currently 24 lines, some of which
> aren't applicable to the page type above.
> -- for the page above the script uses a somewhat clunky three lines
> that produce a shell here-doc file named "page.src".  This file
> represents the html source of the entire page, one step removed from
> being ready to send to the browser.  As I said, the file is still a
> shell here-doc, filled mostly with html but possibly including shell
> variables and command substitution along with the html.  The file is
> built by these lines:
> 
>   echo "cat <<-SYMEOF" > ./-lc/page.src
>   cat $(find -follow -type f -name "[0-9]*" |
>         sed "s/\(.*\)\//\1\/ /" |
>         sort -k 2 | uniq -f 1 | tr -d ' ') >> ./-lc/page.src
>   echo "SYMEOF" >> ./-lc/page.src
> 
> The "SYMEOF" bit is just short for Symplate End of File; I just tried to
> come up with something that would be unlikely to show up in normal html.
> The first line just creates the page.src file in the protected local
> content directory and writes the first line of the heredoc.  The cat
> pipeline is pretty clunky I admit and not very readable but it does do
> what I wanted, that is:
> -- find all the files starting with a digit, this matches all the
> numbered template files, the local ones and the globally linked ones,
> for example:
> ./-lc/thm/tpl/030.tpl
> ./-lc/thm/tpl/010.tpl
> ./-lc/thm/tpl/060.tpl
> ./-lc/thm/tpl/040.tpl
> ./-lc/thm/tpl/090.tpl
> ./-lc/thm/tpl/000.tpl
> ./-lc/thm/tpl/020.tpl
> ./-lc/thm/tpl/100.tpl
> ./-lc/thm/tpl/110.tpl
> ./-lc/thm/tpl/070.tpl
> ./-lc/thm/tpl/080.tpl
> ./-lc/thm/tpl/050.tpl
> ./-lc/015.tpl
> ./-lc/075.tpl
> 
> -- that's all the template files but not in the order we want, so
> first pipe into a sed command that puts a space between the directory
> path and the filename.  This is so sort has two fields to work with.
> ./-lc/thm/tpl/ 030.tpl
> ./-lc/thm/tpl/ 010.tpl
> ./-lc/thm/tpl/ 060.tpl
> ./-lc/thm/tpl/ 040.tpl
> ./-lc/thm/tpl/ 090.tpl
> ./-lc/thm/tpl/ 000.tpl
> ./-lc/thm/tpl/ 020.tpl
> ./-lc/thm/tpl/ 100.tpl
> ./-lc/thm/tpl/ 110.tpl
> ./-lc/thm/tpl/ 070.tpl
> ./-lc/thm/tpl/ 080.tpl
> ./-lc/thm/tpl/ 050.tpl
> ./-lc/ 015.tpl
> ./-lc/ 075.tpl
> 
> -- sort sorts on key 2 which is the filename.
> ./-lc/thm/tpl/ 000.tpl
> ./-lc/thm/tpl/ 010.tpl
> ./-lc/ 015.tpl
> ./-lc/thm/tpl/ 020.tpl
> ./-lc/thm/tpl/ 030.tpl
> ./-lc/thm/tpl/ 040.tpl
> ./-lc/thm/tpl/ 050.tpl
> ./-lc/thm/tpl/ 060.tpl
> ./-lc/thm/tpl/ 070.tpl
> ./-lc/ 075.tpl
> ./-lc/thm/tpl/ 080.tpl
> ./-lc/thm/tpl/ 090.tpl
> ./-lc/thm/tpl/ 100.tpl
> ./-lc/thm/tpl/ 110.tpl
> 
> -- the next command, uniq, doesn't do anything with this file list and
> then the final tr just removes the space so we have complete paths to
> pass to cat.
> -- the cool thing about the uniq step is that it allows us to override
> a global template by creating a local template with the same name.  So
> had there been a 010.tpl file in the local directory it would have
> been sorted before the global 010.tpl because of the shorter path and
> uniq would have kept the first one (local) and dropped the second
> (global).  So if there's something about a global template that
> doesn't quite work on a particular page, we can replace it completely
> without affecting the rest of the site.
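
The override behaviour can be checked on a throwaway tree. The sketch below runs the same pipeline as in the email with two small additions that are assumptions of this example: an explicit "." start path for find, and LC_ALL=C on sort so that byte ordering reliably puts the shorter local path ahead of the global one. The template contents are made up.

```shell
# Sketch only: demonstrates a local template overriding a global one.
work=$(mktemp -d)
mkdir -p "$work/page/-lc/thm/tpl"
cd "$work/page" || exit 1

# Two "global" templates (a real setup reaches these through the thm sym-link).
echo "global 000" > ./-lc/thm/tpl/000.tpl
echo "global 010" > ./-lc/thm/tpl/010.tpl
# A local template with the same name as a global one, plus an extra insert.
echo "local 010" > ./-lc/010.tpl
echo "local 015" > ./-lc/015.tpl

# Same pipeline as in the email: split path and name, sort by name,
# keep the first entry per name, then rejoin the path.
ordered=$(find . -follow -type f -name "[0-9]*" |
    sed "s/\(.*\)\//\1\/ /" |
    LC_ALL=C sort -k 2 | uniq -f 1 | tr -d ' ')

result=$(cat $ordered)
echo "$result"
```

The global 010.tpl never reaches cat, while 000.tpl still comes from the theme and 015.tpl is slotted in after the overridden template.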
> 
> So once that part is finished, we have constructed the page.src
> heredoc. The final part of the script gives us the option of producing
> a static file if desired:
> 
> if [ -e ./-lc/static ]
> then
>   source ./-lc/page.src | tee index.html
>   sed -i -e '1,2d' index.html
> else
>   source ./-lc/page.src
> fi
> 
> By touching ./-lc/static (another instance where I thought a complete
> word was more descriptive than a three character abbreviation like
> stc) the script will see that static flag and then source the page.src
> heredoc so that content is returned to the browser but also tee'd into
> index.html.  The next time the page is requested, althttpd will serve
> the static index.html since it looks for that filename before
> index.cgi.  Explicitly putting "index.cgi" into the url will trigger a
> rebuild of the page.src and index.html.  Removing the static flag file
> and index.html will cause the page to revert back to being script
> generated on every request.  Also, the page.src file doesn't have to
> be built from scratch every page load, if it already exists it gets
> sourced as is (unless you uncomment the line to force a complete
> rebuild of the page).
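
Putting the pieces together, the control flow described above might look roughly like the sketch below. It is a guess at the shape of the logic, not the real src.sh: the function names build_page_src and render_page, the page_title variable, and the sample template are all hypothetical, and the header-stripping sed step from the email is left out because the sample template emits no CGI header.

```shell
# Sketch only: one possible shape for "reuse page.src, optionally write static".
build_page_src() {
    echo "cat <<-SYMEOF" > ./-lc/page.src
    cat $(find . -follow -type f -name "[0-9]*" |
        sed "s/\(.*\)\//\1\/ /" |
        LC_ALL=C sort -k 2 | uniq -f 1 | tr -d ' ') >> ./-lc/page.src
    echo "SYMEOF" >> ./-lc/page.src
}

render_page() {
    # rm -f ./-lc/page.src        # uncomment to force a rebuild on every call
    [ -f ./-lc/page.src ] || build_page_src
    if [ -e ./-lc/static ]; then
        . ./-lc/page.src | tee index.html    # return the page and cache it
    else
        . ./-lc/page.src                     # dynamic: render on every request
    fi
}

# Demo: a single template containing a shell variable expanded at render time.
work=$(mktemp -d)
mkdir -p "$work/page/-lc"
cd "$work/page" || exit 1
echo '<h1>$page_title</h1>' > ./-lc/010.tpl
page_title=Home

first=$(render_page)          # dynamic render, builds page.src
touch ./-lc/static            # set the static flag
second=$(render_page)         # same output, also written to index.html
echo "$second"
```

Because page.src is a here-doc, the variable is expanded only when the file is sourced, so the same page.src can yield different output on later requests.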
> 
> I had about 4 or 5 different command line pipelines to produce the
> correct template order but the sed based one above was fastest and
> works on BusyBox.  Some of the other ones I came up with used find
> arguments that aren't implemented on the BusyBox version of find.
> 
> Anyway, that's basically the general idea.  Another part of the script
> I didn't mention checks for the presence of a "lcl.sh" file and if
> present sources it first before constructing the page.src.  This
> allows you to generate a local template file via a script and do loops
> and whatever else you want to build the local template which then gets
> cat'd into the page.src.  I have a basic example of this at the page
> below.  Which brings me to the links:
> 
> www.symplate.com
> www.sodface.com
> 
> Both of those are being generated by Symplate.  They look terrible but
> I wanted to get something up so I could give you the links in this
> email.  Both domains are hosted on the same server and served by the
> same althttpd instance.  You'll notice they are both using the same
> theme too, which is a complete rip off of the Grav flat file CMS
> structure and color palette.  I'm rubbish at coming up with an
> original design. It's been a long while since I've touched css or html
> and that was by far the most time consuming part of getting a working
> Symplate prototype together.  If you go to one of the sites and view
> page source, you'll see I commented the template file name in each
> template so you can get a feel for how the page is broken down into
> the various templates.
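
As a rough illustration of the lcl.sh hook mentioned earlier, the sketch below sources a small script that writes a numbered local template with a loop before the page would be assembled. The location (./-lc/lcl.sh), the template number 045 and the list items are assumptions for the example; the email does not say where the real file lives or what it contains.

```shell
# Sketch only: a hypothetical lcl.sh that generates a local template.
work=$(mktemp -d)
mkdir -p "$work/page/-lc"
cd "$work/page" || exit 1

cat > ./-lc/lcl.sh <<'EOF'
# Build an html list with a loop and save it as a numbered template.
{
    echo "<ul>"
    for item in one two three; do
        echo "  <li>$item</li>"
    done
    echo "</ul>"
} > ./-lc/045.tpl
EOF

# The hook as described: source lcl.sh, if present, before building page.src.
if [ -f ./-lc/lcl.sh ]; then
    . ./-lc/lcl.sh
fi

generated=$(cat ./-lc/045.tpl)
echo "$generated"
```

Since the generated file name starts with a digit, the find pipeline would pick it up like any hand-written template.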
> 
> Now the negatives:
> -- I don't really know how this would be to use in real life.  Might
> be a complete pain.
> -- Currently, everything is in html, though I feel like adding some cli
> markdown converters and other simple utilities would be very easy.
> -- Security.  Who knows?  D. Richard Hipp says althttpd is secure
> though I'm not using the chroot because it makes doing anything useful
> in the CGI scripts a pain.
> --- and if I'm not accepting posted data, doesn't that reduce a lot of the risk?
> -- The usefulness of the theme idea is questionable.  The idea behind
> the @thm directory was that all your sym-links pointed at it and you
> could just rename a different theme to @thm and all the links would
> still be good but the theme would change.  Seems like it would work
> but because of the template numbering system, user page content will
> usually fall into the same template numbers on every page so a new
> theme would have to work the same way or you'd have to renumber all
> your local page templates.  Not really sure how that's going to work
> out.
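
The rename-based theme switch can be tried out in isolation. In this sketch the theme names "light" and "dark" and the marker file theme-name are invented for the demonstration; only the @thm directory and the page's thm sym-link come from the email.

```shell
# Sketch only: switching the active theme by renaming directories.
root=$(mktemp -d)
mkdir -p "$root/-ln/thm/@thm" "$root/-ln/thm/dark" "$root/home/-lc"
echo "light" > "$root/-ln/thm/@thm/theme-name"   # hypothetical marker file
echo "dark"  > "$root/-ln/thm/dark/theme-name"
ln -s ../../-ln/thm/@thm "$root/home/-lc/thm"

before=$(cat "$root/home/-lc/thm/theme-name")

# Park the current theme under its own name, then promote the other one.
mv "$root/-ln/thm/@thm" "$root/-ln/thm/light"
mv "$root/-ln/thm/dark" "$root/-ln/thm/@thm"

after=$(cat "$root/home/-lc/thm/theme-name")
echo "$before -> $after"
```

The page's sym-link is never touched, so the links stay valid; the numbering concern raised above is a separate problem that this does not address.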
> 
> Thanks and hope to hear from the community on the value (or lack of)
> of Symplate and whether I should do an episode on it or not.  I have
> some other episode ideas if not!
> 
> -- Sodface.
> 
> On Mon, Apr 15, 2019 at 2:42 PM Ken Fallon <ken at fallon.ie> wrote:
>>
>>
>> On 2019-04-03 20:36, Ken Fallon wrote:
>>> Hi All,
>>>
>>> Do any of you have a recommendation for a Static Site Generator that
>>> just publishes html files?
>>>
>>> For example takes a page, adds a header and footer from somewhere and
>>> publishes the combined page.
>>>
>>>
>>
>> Hi All,
>>
>> Thanks for the suggestions, which I am working through.
>>
>> The question was in relation to the Hacker Public Radio site, which is
>> essentially a LAMP based site that is entirely database driven.
>>
>> For the vast majority of the site this is unnecessary as the pages are
>> very static and change infrequently if ever. Those could be written to a
>> static html file without a problem.
>>
>> The general goal is that everything could be rsynced from a server to
>> your local machine and you would get access to a daily snapshot of the
>> entire website. This would allow us to have multiple mirrors of hpr
>> around the place in the event of another DDOS.
>>
>> So the php page
>> http://hackerpublicradio.org/eps.php?id=0013
>>
>> would be written out to a directory accessible via an index.html page under
>> http://localhost/episodes/hpr13/
>>
>> This fixes the problem of the episode 9999 bug, and removes the need for
>> a database query for the page.
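
A minimal sketch of that kind of export, assuming the page bodies have already been dumped to files, could look like this. The file names (header.html, footer.html, pages/13.html) and their contents are hypothetical and this is not how the HPR site is actually built; only the episodes/hpr13/ target path comes from the message above.

```shell
# Sketch only: bake a shared header and footer into per-episode index.html files.
site=$(mktemp -d)
cd "$site" || exit 1

echo "<html><body><header>site header</header>" > header.html
echo "<footer>site footer</footer></body></html>" > footer.html
mkdir -p pages
echo "<p>Episode 13 show notes</p>" > pages/13.html

for page in pages/*.html; do
    id=$(basename "$page" .html)          # e.g. 13
    out="episodes/hpr$id"
    mkdir -p "$out"
    cat header.html "$page" footer.html > "$out/index.html"
done

cat episodes/hpr13/index.html
```

Every output file carries its own copy of the header and footer, so any change to either one means regenerating all of the pages.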
>>
>> Unfortunately if ever the header and footer change we need to change
>> each and every exported page. HTML5 had a way to include pages together
>> fixing the problem but for some reason support for that has been dropped.
>>
>> So that is why I was thinking of a flat file CMS. The down side of that
>> is that if there is a change of header, then every single "rendered"
>> html page would need to be downloaded again because the change is
>> incorporated in every single page.
>>
>> However after thinking about it for a while, the people who are helping
>> out by doing this must have the technical expertise to rsync the site
>> locally. So it's safe to assume that they also can follow an instruction
>> page on how to set up a local lamp server.
>>
>> Then we could actually distribute a more or less static html website,
>> but use php to include the header and footer. That would not exclude the
>> need for a flat file cms, but the integration would be more focused on
>> the dynamic content.
>>
>> So I intend to set up a git repo with an index.html page that uses php to
>> include a header and footer. Trying to make the local site at least
>> usable if php or a webserver is not available.
>>
>> I'm not sure if this is even something that would be of interest to
>> people, but if it is, then I will put up links when I have something ready.
>>
>> --
>> Regards,
>>
>> Ken Fallon
>> http://kenfallon.com
>> http://hackerpublicradio.org/correspondents.php?hostid=30
>>
>> _______________________________________________
>> Hpr mailing list
>> Hpr at hackerpublicradio.org
>> http://hackerpublicradio.org/mailman/listinfo/hpr_hackerpublicradio.org

