
Are you a regular Coursera user? If so, you might like this Python program. With adjusted URLs and class names, the same approach should work for similar course websites such as edX, Alison, and Udemy.
So, let’s start. If you already have some experience with Python and BeautifulSoup, you have everything you need to build your own. If not, follow along.
Install Python for your operating system. After that, install requests and BeautifulSoup like this:
pip install requests beautifulsoup4
After that, let’s import the modules and make a variable to store the base URL:
import requests
from bs4 import BeautifulSoup
baseUrl = "https://www.coursera.org"
Now, take the search terms from the command line (or hard-code them) and split them into a list of words:
skillset = input().split()
Now, let’s look at the URL the site uses for a search.
Example: java

Fig.1. Search field in a course website

Fig.2. Check the url and find a pattern
So, the important part is what comes after “query=”. That is where we append the user’s input, joining multiple words with %20 (an encoded space):
courseraUrl = "https://www.coursera.org/search?query=" + "%20".join(skillset)
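Joining words by hand works for plain terms, but characters such as + or & need escaping before they go into a URL. Here is a minimal sketch that lets the standard library do the encoding. The helper name build_search_url and the constant SEARCH_URL are made up for this example; only the "query" parameter comes from the URL pattern above.

```python
from urllib.parse import urlencode

# Assumption: the search endpoint follows the pattern seen in the browser.
SEARCH_URL = "https://www.coursera.org/search"


def build_search_url(skills):
    """Return a search URL for a list of words (hypothetical helper)."""
    query = " ".join(skills)
    # urlencode escapes reserved characters and encodes spaces as '+'.
    return SEARCH_URL + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_search_url(["machine", "learning"]))
    print(build_search_url(["c++"]))
```

In a query string, a space can be written as either + or %20, so both forms should lead to the same search.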
Now we request the page and pass page.text to BeautifulSoup, using Python’s built-in html.parser. This gives us a BeautifulSoup object, that is, a parse tree of the page’s HTML:
page = requests.get(courseraUrl)
soup = BeautifulSoup(page.text, "html.parser")

Fig.3. Copy class of h2 tag for the course header
So, we will use the find_all() method of the BeautifulSoup object to get all the course names. The class name is the same for every course card on the page:
found_all = soup.find_all("h2", {"class": "color-primary-text card-title headline-1-text"})
You can get the course links the same way:

Fig.4. Copy class of a tag for course url
foundU_all = soup.find_all("a", {"class": "rc-DesktopSearchCard anchor-wrapper"})
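A note on matching by class: when the class value passed to find_all() contains several names, BeautifulSoup compares it with the exact class string, so a reordered class attribute would no longer match. The sketch below matches on a single class name instead and reads each title from inside its link, so names and URLs stay paired. The sample markup, the choice of card-title and anchor-wrapper as the classes to match, and the helper name extract_courses are assumptions for illustration, not a description of the real page structure.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://www.coursera.org"

# Hypothetical markup, loosely modelled on the class names used above.
SAMPLE_HTML = """
<ul>
  <li><a class="rc-DesktopSearchCard anchor-wrapper" href="/learn/java-programming">
    <h2 class="color-primary-text card-title headline-1-text">Java Programming</h2></a></li>
  <li><a class="rc-DesktopSearchCard anchor-wrapper" href="/learn/python-basics">
    <h2 class="card-title headline-1-text color-primary-text">Python Basics</h2></a></li>
  <li><a class="anchor-wrapper" href="/about">About</a></li>
</ul>
"""


def extract_courses(html, base_url=BASE_URL):
    """Return (name, absolute_url) pairs, one per card that has a title."""
    soup = BeautifulSoup(html, "html.parser")
    courses = []
    # class_ with a single name matches any tag whose class list contains it.
    for card in soup.find_all("a", class_="anchor-wrapper"):
        title = card.find("h2", class_="card-title")
        href = card.get("href")
        if title is None or href is None:
            continue  # skip links that are not course cards
        courses.append((title.get_text(strip=True), urljoin(base_url, href)))
    return courses


if __name__ == "__main__":
    for name, url in extract_courses(SAMPLE_HTML):
        print(name, url)
```

Reading the title and the link from the same card avoids relying on two separate lists having the same length and order.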
Now we will loop over both lists and print all the names and URLs:
for courseName in found_all:
    print(courseName.text)
for courseUrls in foundU_all:
    # href is a relative path, so prepend the base URL
    toUrl = courseUrls.get("href")
    courseUrl = baseUrl + toUrl
    print(courseUrl)
Check out the output:

Fig.5. CLI Output
It’s ugly and not much use like this, right? So I moved the output to a CSV file using Python’s csv module:
import csv
dict_course = {}
with open("course.csv", "w", newline="") as file:
    myFields = ["courseName", "courseUrl"]
    writer = csv.DictWriter(file, fieldnames=myFields)
    writer.writeheader()
    for i in range(len(found_all)):
        # for course urls
        toUrl = foundU_all[i].get("href")
        courseUrl = baseUrl + toUrl
        # store it in a dictionary: courseName -> courseUrl
        dict_course[found_all[i].text] = courseUrl
        writer.writerow({"courseName": found_all[i].text, "courseUrl": courseUrl})

Then I wrote an HTML file to convert this into an HTML table.
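That HTML file is not shown here, so as a stand-in, here is a rough Python sketch of the same idea: read the CSV text and emit a simple table. This is not the original converter. The function name csv_to_html_table and the sample data are invented for the example; only the courseName and courseUrl column names come from the CSV step above.

```python
import csv
import html
import io

# Made-up sample rows in the same two-column layout as course.csv.
SAMPLE_CSV = (
    "courseName,courseUrl\n"
    "Java & Friends,https://www.coursera.org/learn/java\n"
    "Python Basics,https://www.coursera.org/learn/python-basics\n"
)


def csv_to_html_table(csv_text):
    """Turn CSV text with courseName/courseUrl columns into an HTML table."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = ["<table>", "  <tr><th>Course</th><th>Link</th></tr>"]
    for row in reader:
        # Escape values so characters like & or < cannot break the markup.
        name = html.escape(row["courseName"])
        url = html.escape(row["courseUrl"], quote=True)
        lines.append(
            f'  <tr><td>{name}</td><td><a href="{url}">{url}</a></td></tr>'
        )
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(csv_to_html_table(SAMPLE_CSV))
```

To use it on the real file, read course.csv with open("course.csv", newline="") and pass the text to the function, then write the result to an .html file.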

My github link for this program: github
Thanks for reading :)