Some time ago we explained how to extract links from a webpage in Extracting links from a webpage with Python, as a first step towards publishing complete blog posts on Facebook. The idea was to prepare the text obtained from an RSS feed so that it can be published on a Facebook page (or elsewhere). Remember that Facebook does not allow HTML in page posts (or at least I did not find a way to include it).
We had previously presented some related ideas, in that case for Twitter, in Publishing in Twitter when posting here.
Now we are going to use the Facebook API and an unofficial package which implements it in Python, Facebook Python SDK.
We can install it with
fernand0@aqui:~$ sudo pip install facebook-sdk
It will need `BeautifulSoup` and `requests` (and maybe some other modules). If they are not installed in our system, we will get the corresponding ‘complaints’. We can install them as usual with pip (or with our preferred system).
We need some credentials in order to publish on Facebook. First we have to register our application in Facebook My Apps, with the button ‘Add a new App’ (there are plenty of tutorials if you need help). We will use the ‘advanced setup’ (registering web applications seems to be easier) and we will be given some identifiers, mainly the OAuth token; we can find them at My Apps, following the link for our app. We will store this token in `~/.rssFacebook`, and our program will use it later.
This configuration file contains a `Facebook` section with an `oauth_access_token` entry.
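As an illustration, the credentials file could be loaded as in the following sketch. The section name `Facebook` and the option `oauth_access_token` are the ones the program reads later; the token value is a placeholder, and the sketch uses Python 3's `configparser` (the program itself is written for Python 2).

```python
import configparser

# Placeholder content for ~/.rssFacebook; the token value is made up.
SAMPLE = """
[Facebook]
oauth_access_token = REPLACE_WITH_YOUR_TOKEN
"""


def read_token(text):
    """Return the OAuth token stored in the [Facebook] section."""
    config = configparser.ConfigParser()
    config.read_string(text)
    return config.get("Facebook", "oauth_access_token")


token = read_token(SAMPLE)
print(token)
```

With a real file we would call `config.read()` with the expanded path of `~/.rssFacebook` instead of `read_string()`.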
The program is very simple and can be downloaded from rssToPages.py (the link points to the version discussed here; it has evolved since then).
The program starts by reading the configuration of the available blogs, and then we need to choose one. If there is just one, no selection is needed:
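import ConfigParser
config = ConfigParser.ConfigParser()
# The blogs configuration file is loaded here with config.read(...) (line not shown)
print "Configured blogs:"
i = 1
for section in config.sections():
    # Show each configured blog with its number and its RSS feed
    print str(i), ')', section, config.get(section, "rssFeed")
    i = i + 1
# After the loop i is the number of blogs plus one
if i > 2:
    i = raw_input ('Select one: ')
else:
    i = 1
print "You have chosen ", config.get("Blog"+str(i), "rssFeed")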
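The selection step can be sketched in Python 3 as a small function. This is only an illustration: the function name `choose_blog`, the `ask` parameter (which lets us replace `input` when trying the code) and the example feeds are assumptions, not part of the original program.

```python
import configparser

# Made-up blogs file with two sections, named as the program expects.
SAMPLE = """
[Blog1]
rssFeed = http://example.com/feed.xml

[Blog2]
rssFeed = http://example.org/rss
"""


def choose_blog(config, ask=input):
    """Return the name of the selected section, asking only if needed."""
    sections = config.sections()
    print("Configured blogs:")
    for number, section in enumerate(sections, start=1):
        print(f"{number}) {section} {config.get(section, 'rssFeed')}")
    if len(sections) > 1:
        number = int(ask("Select one: "))
    else:
        number = 1  # a single blog: nothing to choose
    return sections[number - 1]


config = configparser.ConfigParser()
config.read_string(SAMPLE)
# The lambda simulates a user typing "2" at the prompt.
chosen = choose_blog(config, ask=lambda prompt: "2")
print("You have chosen", config.get(chosen, "rssFeed"))
```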
The configuration file must contain a section for each blog (`Blog1`, `Blog2`, and so on); each of them holds the RSS feed, the Twitter account and the name of the Facebook page.
In the entries for this site the Facebook page is left empty: this blog does not have a Facebook page (yet?).
A second blog would get its own section with the same fields.
This configuration file can have yet another field, `linksToAvoid`, which is used to select some links that will not be shown (I have another blog, and in this way I can leave out its category links).
if (config.has_option("Blog"+str(i), "linksToAvoid")):
    linksToAvoid = config.get("Blog"+str(i), "linksToAvoid")
else:
    # No pattern configured: every link will be shown
    linksToAvoid = ""
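To make the layout of the blogs file more concrete, here is a hypothetical example parsed with Python 3's `configparser`. Only `rssFeed` and `linksToAvoid` are option names taken from the program; `twitterAc` and `pageFB` are assumed names for the Twitter account and the Facebook page, and all the values are invented.

```python
import configparser

# Hypothetical blogs file. twitterAc and pageFB are assumed option names.
SAMPLE = """
[Blog1]
rssFeed = http://example.com/feed.xml
twitterAc = exampleaccount
pageFB =

[Blog2]
rssFeed = http://example.org/rss
twitterAc = otheraccount
pageFB = Example Page
linksToAvoid = example.org/category
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

for section in config.sections():
    # fallback="" plays the role of the has_option() test in the program
    links_to_avoid = config.get(section, "linksToAvoid", fallback="")
    print(section, repr(config.get(section, "pageFB")), repr(links_to_avoid))
```

The first section has an empty Facebook page and no `linksToAvoid`, so both values come out as empty strings.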
Now we read the last post of the blog and extract its text and links, in a similar way to what we saw in Extracting links from a webpage with Python (not shown here).
And now we leave out the links we want to avoid:
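linksTxt = ""
j = 1
# links: the <a> tags extracted from the post (variable name assumed, loop not shown in the original)
for link in links:
    print re.search(linksToAvoid, link['href'])
    if ((linksToAvoid =="")
            or (not re.search(linksToAvoid, link['href']))):
        # link.contents is a list; its first element is assumed to be the link text
        linksTxt = linksTxt + "["+str(j)+"] " + link.contents[0] + "\n"
        linksTxt = linksTxt + " " + link['href'] + "\n"
        j = j + 1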
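The filtering can be tried in isolation with the following Python 3 sketch. It is a simplification under some assumptions: the links are plain `(text, href)` pairs instead of BeautifulSoup tags, and the function name `format_links` and the example URLs are made up.

```python
import re


def format_links(links, links_to_avoid=""):
    """Build a numbered list of links, skipping those matching the pattern.

    links is a list of (text, href) pairs; links_to_avoid is a regular
    expression, or "" to keep every link.
    """
    lines = []
    number = 1
    for text, href in links:
        if links_to_avoid and re.search(links_to_avoid, href):
            continue  # the link matches the pattern: leave it out
        lines.append(f"[{number}] {text}")
        lines.append(f"    {href}")
        number += 1
    return "\n".join(lines)


links = [
    ("A post", "http://example.com/2015/a-post"),
    ("Python", "http://example.com/category/python"),
    ("Another site", "http://example.org/"),
]
print(format_links(links, links_to_avoid="/category/"))
```

Only the links that are kept get a number, so the numbering has no gaps.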
We then check whether the post contains an image. If it does not, we will not add one, but Facebook will (it will use the first image it can find in our page).
We could configure a default image to be used when needed (when our post has no image and we do not like the one chosen by Facebook), or we can try to always include an image in our posts.
# pageImage is assumed to be the list of <img> tags found in the post
if len(pageImage) > 0:
    imageLink = (pageImage[0]["src"])
else:
    imageLink = ""
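The idea of a configurable default image could look like the following Python 3 sketch. It is not what the program does: it uses the standard library `HTMLParser` instead of BeautifulSoup so that it runs without extra packages, and the names `ImageCollector`, `choose_image` and `default_image` are assumptions.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def choose_image(html, default_image=""):
    """Return the first image of the post, or default_image if none."""
    collector = ImageCollector()
    collector.feed(html)
    return collector.sources[0] if collector.sources else default_image


print(choose_image('<p>Hi <img src="a.png"> <img src="b.png"></p>'))
print(choose_image("<p>No image</p>", "http://example.com/default.png"))
```

With an empty `default_image` the behaviour matches the one described above: no image is sent and Facebook picks one itself.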
Now we read the Facebook configuration and ask for the list of pages that the user manages (remember that we have set the desired page in the configuration of the blog):
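import os
import facebook
config.read([os.path.expanduser('~/.rssFacebook')])
oauth_access_token = config.get("Facebook", "oauth_access_token")
graph = facebook.GraphAPI(oauth_access_token)
pages = graph.get_connections("me", "accounts")  # pages managed by the user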
We could define more Facebook accounts, but I have not tested this feature, so it may not work as expected (and, of course, there is no way to select one of them).
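for i in range(len(pages['data'])):
    if (pages['data'][i]['name'] == pageFB):
        print "Writing in... ", pages['data'][i]['name']
        # Use the page's own token so that the post is published as the page
        graph2 = facebook.GraphAPI(pages['data'][i]['access_token'])
        # Start of the call reconstructed (lost in the original); other arguments may be missing
        graph2.put_object(pages['data'][i]['id'],
            "feed", message = theSummary, link=theLink,
            picture = imageLink,
            )
        # "Publicado" is Spanish for "Published"
        statusTxt = "Publicado: "+theTitle+" "+theLink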
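The publishing step can be sketched in Python 3 so that it runs without network access. The methods `get_connections` and `put_object` are the ones facebook-sdk provides; the rest (the functions `find_page` and `publish`, the `make_graph` parameter and the `FakeGraph` stand-in with its invented data) are assumptions made for this illustration.

```python
def find_page(pages, page_name):
    """Return the entry of pages['data'] whose name is page_name, or None."""
    for page in pages["data"]:
        if page["name"] == page_name:
            return page
    return None


def publish(graph, page_name, summary, link, image_link, make_graph):
    """Post to the page called page_name; return the API result or None.

    graph is a GraphAPI built with the user token; make_graph builds a
    GraphAPI from a page token (facebook.GraphAPI in real use).
    """
    page = find_page(graph.get_connections("me", "accounts"), page_name)
    if page is None:
        return None
    data = {"message": summary, "link": link}
    if image_link:
        data["picture"] = image_link  # omitted when there is no image
    page_graph = make_graph(page["access_token"])
    return page_graph.put_object(page["id"], "feed", **data)


class FakeGraph:
    """Stand-in for facebook.GraphAPI so that the sketch runs offline."""

    def __init__(self, token):
        self.token = token

    def get_connections(self, node, connection):
        return {"data": [{"name": "Example Page", "id": "123",
                          "access_token": "page-token"}]}

    def put_object(self, parent_object, connection_name, **data):
        return {"id": parent_object + "_1", "sent": data}


result = publish(FakeGraph("user-token"), "Example Page",
                 "Summary text", "http://example.com/post", "", FakeGraph)
print(result)
```

With the real package, `graph` would be `facebook.GraphAPI(oauth_access_token)` and `make_graph` would be `facebook.GraphAPI` itself.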
This program has been tested over the last few months and the solution seems to work (you may want to check the latest version, which has some bugs corrected).
The most cumbersome part was getting the credentials and registering the app (with a ‘fake’ production step; for me it is ‘fake’ because I am the only user of the app).
This post was published originally (in Spanish) at: Publicar en Facebook las entradas de este sitio.
If you have doubts, comments, ideas… Please comment!