This is Hacker Public Radio Episode 3,654 for Thursday, 4 August 2022. Today's show is entitled "Use the data in the Ogg feed to create a website". It is hosted by Norrist and is about 13 minutes long. It carries a clean flag. The summary is: How much of a site can I make using only the data from the feed?

Welcome back. This is part two of my experiment to see how much I can get done with the data that's just in the RSS feed for Hacker Public Radio, or just the Ogg feed specifically. If you want to check back, part one was HPR episode 3637, and that episode talked about how I took what I felt were the most interesting bits from the RSS feed and inserted them into a SQLite database. So today I'll discuss how I took the data that I had stuck in that SQLite database and created a static website.

So, a couple of quick things before we jump into the details. I think I probably could have skipped the database step, where I take the data from the feed and put it in a database, and then take the data out of the database and use it to build a site. I probably could have gone straight from the feed to what I was using to build the static site. It was extra code and extra time, but that's how projects go sometimes. The first time you do it, you do it the way you think you want to do it, and then you realize maybe there was some extra code or extra steps that you didn't necessarily need. One advantage of putting it in a database first, though, is that it did work as a cache, so that every time you built the site you wouldn't have to pull the feed in.

And then the other thing I wanted to say real quick was that I was really struck by just the total number of episodes there are out there for Hacker Public Radio. That's a lot of work that's been put into building over 3,000 episodes, so just a quick thanks to everyone who's ever created an episode. I really mean it.
My original intent when I started the project was that I would use Markdown to build the site. A lot of static site generators, like Hugo or Jekyll, sort of work with Markdown files, where you build a bunch of Markdown and you throw it at the static site generator and it builds a nice-looking site. I started down that path, but one thing about Markdown is that you can add inline HTML if you need to. I started with just Markdown, and it didn't look like I wanted it to, so I started adding HTML elements, and by the time I got the site to look like a website there was more HTML than Markdown. So I just kind of scrapped the Markdown base.

Now, I can hear all of you Markdown detractors out there saying, of course, Markdown is terrible, and I should never have tried to build a website with it. And, you know, I'm a big fan of Markdown. I use it all the time. If you're going to do something like taking notes or writing documentation, I think Markdown is a great tool for doing it, but it didn't fit this particular use case. So what I ended up doing was, and I'll talk in a second about the templating that I did, instead of taking data out of the database and templating it into Markdown, I just templated it directly into HTML and skipped the step of converting Markdown to HTML.

A couple of the libraries that I used to do the work: one was the peewee
Python library, which is used to translate database calls into something a little more Pythonic. I talked about that in the last episode. And then to do the templating I used Jinja. It's a pretty easy template language, something I was already familiar with, so it seemed like a good fit.

So, strictly speaking, I wanted to use only the data from the feed to create, or to recreate, a website, and that proved to be hard. Not impossible, you could certainly do it, but if you want to introduce things like a logo, or headers and footers, and a little bit of styling, you have to pull in some extra content. So aside from the data that I got from the RSS feed, I wrote an HTML header and footer for every page. In the header I'm pulling in the Bootstrap CSS, so I can use Bootstrap to do some of the layout, and I've also got the HPR logo in there. And then in the footer I basically copied the footer from the HPR site, so it's got links to related projects and it's got the copyright information.

So I was able to build four different types of pages using the data from the feed. I built sort of a replica of the main page for HPR, where it lists the most recent episodes. I also built a page per episode, and then I also built one page that lists every episode. So for the episode-specific stuff, there's a main page that shows the recent ones, there's an all-episodes page that lists every episode, and then there's one page per episode, and that's where you can kind of drill down to the episode and read the show notes.

And then for the hosts I did something similar, where I built one page that lists every host, and in the table it's got their host name. I also calculated how many shows every host produced, and all of that is in a table. And then each host has their own individual page that lists all of their shows. And there should be links. I tried to create links where it makes sense, so
if you're on one of the episode pages, the host name, the name of the contributor or the host, should be a link to that individual host's page.

So there's a lot of data on the HPR website that is not in the feed, and that shouldn't be surprising. The feed isn't meant to be a website or meant to replace the website; it's meant to give you information about individual shows. So I couldn't exactly recreate the HPR website using just the data in the feed, because there's some stuff that's just not there. For example, on the host pages each individual host has a profile, and they will list maybe a web page or an avatar or something like that. That's not in the feed. For individual shows there's things like the tag information and the series, if the show was part of a series. Neither of those are in the feed. There's also a show summary: whenever you submit a show you have to give it a short summary, a hundred characters or less, and that's not in the feed that I could find. Maybe it's there, but I couldn't find it. And then finally, missing was the license. I couldn't find that in the feed information.

And then of course there are some web pages on the HPR site that I wasn't able to replicate, because they don't have anything to do with individual shows, so they're obviously not going to be in the feed. Pages like what you need to know, or how to help out, or request a topic: there was really no way to recreate those from just the RSS feed.

So, just a quick overview of how the project works. I'll have a link later to the GitLab page, so you can see for yourself, but like I said earlier, I use peewee
to read from the SQLite file. Then I've got a Python script that pulls the data out of the SQLite file, aggregates it, kind of packages it up a little bit, and then uses Jinja templates to build the pages. There's a template for the index page, or the main page. There's a template for the all-episodes page, where I list all the shows out. Each individual contributor has their own page, and that's got a separate template. And then the correspondents page, with every host on one page, that's got a separate template too.

So, some things I'd like to do next with the project. One, I'd like to try and incorporate the comments. There is an RSS feed for comments. I haven't looked at it yet, but I think it would be possible to take the RSS feed for comments and match them with the RSS feed for the individual shows, and then be able to display the per-show comments on the page.

I think I can recreate the RSS feed from the data that I collect in SQLite. I know that seems a little funny to me, kind of saying it out loud, taking an RSS feed, putting it in a database, and then recreating a separate RSS feed, but just for the sake of trying to build the most complete site possible, I think that's something I'm going to look at, to see if I can rebuild the RSS feed.

I'm not sure how, but I think I would like to try and figure out how to get the pages that aren't in the feed into the static site, to try to recreate the pages I mentioned earlier, like what you should know and pages like that.

Then next, I mentioned that one of the things missing from the RSS feed is tags. I really think it might be possible to use some natural language processing, or some keyword extraction, or something like that, and see if I can generate some tags or keywords for the shows.

And then sort of the final thing on my want-to-do list is to modify how I grab the data from the feed and insert it into the database. There's two feeds for HPR. I think most people use the latest feed,
which has 10 episodes, and there's also a full feed that's got every episode. So what I would like to be able to do is, the first time you run the Python script to build the database, it uses the full feed, and then subsequent times it uses the most recent feed. I haven't quite figured out how to do that yet.

So I'll have a link to the GitLab page in the show notes. I'll welcome pull requests, or comments on the episode, or angry emails, or just however you want to. If you feel like you have an improvement or any suggestions, they're obviously welcome. And then I'll also link to the static site. When I build the site, I copy it up to a web host. I'll read that real quick: it's hpr.norrist.xyz. So if you just want to look and see how the site turned out, I put a copy out on the internet. I think I've got it set up to do a daily update, but we'll see how that goes. And that's it. Thanks for listening, and I'll see you guys next time.

You have been listening to Hacker Public Radio at HackerPublicRadio.org. Today's show was contributed by an HPR listener like yourself. If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is. Hosting for HPR has been kindly provided by AnHonestHost.com, the Internet Archive, and rsync.net. Unless otherwise stated, today's show is released under a Creative Commons Attribution 4.0 International license.
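
A note to go with the hosts page described in the episode, where each host is listed in a table with a count of their shows and a link to their own page. This is only a rough sketch of that idea, not the project's actual code. To keep it runnable with nothing installed, the standard library's sqlite3 and string.Template stand in for peewee and Jinja, and the table name, column names, sample rows, and link layout are all made up for illustration.

```python
import sqlite3
from html import escape
from string import Template

# Hypothetical schema: one row per episode, roughly as it might be
# stored after reading the feed. Names and sample data are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE episode (number INTEGER PRIMARY KEY, title TEXT, host TEXT)"
)
conn.executemany(
    "INSERT INTO episode VALUES (?, ?, ?)",
    [
        (1, "First sample show", "host_a"),
        (2, "Second sample show", "host_b"),
        (3, "Third sample show", "host_a"),
    ],
)


def host_counts(conn):
    """Return (host, show_count) pairs, most shows first, then by name."""
    return conn.execute(
        "SELECT host, COUNT(*) AS shows FROM episode "
        "GROUP BY host ORDER BY shows DESC, host"
    ).fetchall()


# Stand-ins for Jinja templates: one table row per host, linking to a
# hypothetical per-host page, plus a wrapper for the whole table.
ROW = Template(
    '<tr><td><a href="hosts/${slug}.html">${name}</a></td><td>${shows}</td></tr>'
)
PAGE = Template("<table>\n<tr><th>Host</th><th>Shows</th></tr>\n${rows}\n</table>")


def render_hosts_page(counts):
    """Template the aggregated counts directly into an HTML table."""
    rows = "\n".join(
        ROW.substitute(slug=escape(host), name=escape(host), shows=shows)
        for host, shows in counts
    )
    return PAGE.substitute(rows=rows)


html_page = render_hosts_page(host_counts(conn))
print(html_page)
```

The aggregation is done in SQL with GROUP BY, so the Python side only has to hand the rows to the template. A real build would presumably write the result to a file and wrap it in the shared header and footer.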