Flask 2: RSS

our basic app, RSS, RSS in our flask app

Sep 02, 2024

Let’s pick up where we left off last time.

1 - Our basic app

First up, we’re going to lay down the basics for our new Flask app, which is pretty similar to the Hello World app we just did. Open up your text editor and create a file named headlines.py. Here's what you'll need to write in it:

from flask import Flask

app = Flask(__name__)

@app.route("/")
def get_news():
  return "no news is good news"

if __name__ == '__main__':
  app.run(port=5000)

Here’s what happens: If you go to the main page of the app (“/”), it will show the message "no news is good news." The last part of the code makes sure that if you’re running this file directly, it will start the app on your computer using port 5000.

2 - RSS

RSS is an old-school way to keep up with updates from websites. The name is usually expanded as Really Simple Syndication or Rich Site Summary, but most people just call it RSS. It uses a format called XML to structure content in a predictable way. This is super handy for reading news articles without having to revisit the website over and over to check if there's anything new.

Think of how a news website works, where the big stories get the most space and stick around longer. If you’re visiting these websites often, you might see the same stories repeatedly. Then there are places like some personal blogs that hardly ever update, and you find yourself checking them for nothing most of the time. RSS feeds fix this by letting you subscribe to a site's updates.

Because RSS has a clear structure, it’s easy to automatically grab and use details like headlines, article text, and publication dates with a bit of Python programming. We’ll use RSS feeds from BBC news to show news in our app.
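To make that "clear structure" concrete, here is a minimal sketch that reads a tiny RSS document using only Python's standard library. The feed text is made up for illustration (it is not real BBC data), and read_items is a hypothetical helper name. A real feed has more fields and more quirks than this.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document (made-up content, not real BBC data).
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>https://example.com/</link>
    <description>A made-up feed for illustration</description>
    <item>
      <title>First headline</title>
      <description>Summary of the first story.</description>
      <pubDate>Mon, 02 Sep 2024 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second headline</title>
      <description>Summary of the second story.</description>
      <pubDate>Mon, 02 Sep 2024 08:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def read_items(rss_text):
    """Return one dict per <item>, with its title, summary and published date."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title"),
            "summary": item.findtext("description"),
            "published": item.findtext("pubDate"),
        })
    return items


for story in read_items(SAMPLE_RSS):
    print(story["published"], "-", story["title"])
```

Every story sits in its own item element with the same child tags, which is what makes headlines, summaries, and dates easy to pull out in a loop.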

While it's totally possible to write our own code to handle RSS feeds, we'll use a Python library called feedparser to make things easier. This library deals with the quirks of different RSS versions and lets us work with the data smoothly. To get started, open up your terminal and run this command to install feedparser:

pip install --user feedparser

For this Flask app, we’ll use the BBC’s RSS feed provided on this link: https://feeds.bbci.co.uk/news/rss.xml

3 - RSS in our flask app

In our headlines.py file, we’re going to tweak things a bit to use the feedparser library we just installed. This will help us read the RSS feed and grab the first article from it. Then, we'll format this article with some simple HTML to display it in our app. If HTML sounds new to you, it stands for Hyper Text Markup Language, and it's what's used to set up the appearance and format of text on web pages. If you've never used it before, you might want to check out a beginner’s tutorial. There are plenty of free ones online—W3Schools, for example.

So, what we're adding to our code is: an import statement for feedparser, a new line to store the URL of the RSS feed, and a bit of logic to process that feed, pull out the information we want, and wrap it up in basic HTML. The changes will look something like this:

import feedparser
from flask import Flask

app = Flask(__name__)

BBC_FEED = "https://feeds.bbci.co.uk/news/rss.xml"

@app.route("/")
def get_news():
  feed = feedparser.parse(BBC_FEED)
  first_article = feed['entries'][0]
  return """<html>
    <body>
        <h1> BBC Headlines </h1>
        <b>{0}</b> <br/>
        <i>{1}</i> <br/>
        <p>{2}</p> <br/>
    </body>
</html>""".format(first_article.get("title"), first_article.get("published"), first_article.get("summary"))

if __name__ == "__main__":
  app.run(port=5000, debug=True)

Here's what happens in our code: first, we ask feedparser to fetch the RSS feed from the BBC website. It downloads the feed, parses it, and returns an object that behaves like a Python dictionary. Its entries key holds a list with one item per news story, so feed['entries'][0] gives us the very first story, which we save in a variable. From that entry we read the headline, the publication date, and a short summary.

In the return statement, we're creating a super simple web page using HTML. HTML uses different tags to format text, like <html> and <body> for the structure of the page, <h1> for the main heading, <b> to make text bold (which we use for the headline), <i> to italicize text (we use this for the date), and <p> for paragraph text (where we put the summary). We read each field with .get() instead of square brackets, which means a missing field won't crash our program; .get() returns None instead, so the page will show the text "None" in that spot.
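One detail about .get() is worth seeing in action: for a missing key it returns None, and format() then writes the text "None" into the page. The sketch below uses a plain dictionary as a stand-in for a feed entry (hypothetical data, not a real BBC article) and shows how a second argument to .get() supplies a fallback instead.

```python
# A plain dict standing in for one feed entry. The "summary" key is
# deliberately missing (hypothetical data, not a real BBC article).
article = {
    "title": "Example headline",
    "published": "Mon, 02 Sep 2024 09:00:00 GMT",
}

template = "<b>{0}</b> <i>{1}</i> <p>{2}</p>"

# .get() returns None for a missing key, and format() renders it as "None".
without_default = template.format(
    article.get("title"),
    article.get("published"),
    article.get("summary"),
)

# A second argument to .get() supplies a fallback value instead.
with_default = template.format(
    article.get("title", ""),
    article.get("published", ""),
    article.get("summary", "No summary available"),
)

print(without_default)
print(with_default)
```

Either way the program keeps running, which is the point of using .get() over article["summary"]; the fallback just makes the page read better.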

We didn't include error handling here, so there are a few things that could go wrong, like if there's no internet or if the BBC's feed has a problem. In a real app, you'd want to handle these possibilities with some error-catching. Also, in a professional setting, we wouldn't mix HTML code directly inside Python like this; we'll cover a better way to manage HTML in the next chapter. For now, if you run the app, fire up your web browser, and go to http://127.0.0.1:5000/, you'll see a very basic display of the first news story.
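Coming back to the missing error handling: here is one way it might look, as a rough sketch rather than the definitive fix. It assumes the feed object is dictionary-like, as the result of feedparser.parse is, and that a failed download tends to leave the entries list empty rather than raising an error (worth confirming against the feedparser documentation). The helper name render_first_article and the use of html.escape are additions for illustration, not part of the original app.

```python
from html import escape


def render_first_article(feed):
    """Return HTML for the first entry, or a fallback page if there is none."""
    # Guard against a missing or empty entries list, so entries[0]
    # cannot raise an IndexError.
    entries = feed.get("entries") or []
    if not entries:
        return (
            "<html><body><h1>BBC Headlines</h1>"
            "<p>No headlines available right now.</p></body></html>"
        )

    first = entries[0]
    # Fall back to placeholder text for missing fields, and escape the
    # values so stray markup in the feed is shown as plain text.
    title = escape(first.get("title") or "Untitled")
    published = escape(first.get("published") or "")
    summary = escape(first.get("summary") or "")
    return (
        "<html><body><h1>BBC Headlines</h1>"
        "<b>{0}</b> <br/><i>{1}</i> <br/><p>{2}</p>"
        "</body></html>"
    ).format(title, published, summary)


# Plain dictionaries stand in for parsed feeds here.
print(render_first_article({"entries": []}))
print(render_first_article({"entries": [{"title": "Example headline"}]}))
```

In the app itself, get_news could then return render_first_article(feedparser.parse(BBC_FEED)), so the page still loads when the feed is unavailable.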
