The if/else statement is sending an email no matter what I use in the if part. I'm not sure what the problem is. I tried a bunch of different operators, namely < 3, <= 3, < 2, == -3, == 3.
I want to (1) count how many times a given phrase occurs in the HTML, and (2) stop if it occurs no more than a given number of times. Otherwise, proceed (the script goes on to send an email).
'''Checks school websites for changes in meal assistance during C19 pandemic'''
# Import requests (to download the page)
import requests
# Import BeautifulSoup (to parse what we download)
from bs4 import BeautifulSoup
# Import win32com (to allow email)
import win32com.client
from win32com.client import Dispatch, constants
import re

while True:
    # set the url
    url = 'https://manteno5.org/news/what_s_new/c_o_v_i_d-19_updates'
    # set the headers like we are a browser
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 ( KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
    # download the homepage
    response = requests.get(url, headers=headers)
    # parse the downloaded homepage and grab all text
    soup = BeautifulSoup(response.text, "lxml")
    # the word I'm looking for
    searched_word = 'Grab and Go'
    # find how many times the word occurs
    results = soup.body.find_all(text=re.compile('{0}'.format(searched_word)), recursive=True)
    print('Found the word "{0}" {1} time(s)\n'.format(searched_word, len(results)))
    # if the number of times "Grab and Go" occurs on the page is less
    # than a given number print "No change"
    if str(soup).find(searched_word) == -3:
        print("No change")
        continue
    # but if the word "Grab and Go" occurs any other number of times
    else:
        # script goes on to send an email...
        break
It sent the email even though "Grab and Go" only occurred three times. Printed output:
Found the word "Grab and Go" 3 time(s)
if len(results) <= 3:
    do stuff
The line that does the printing already computes len(results), so you can reuse that count in the if.
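A minimal sketch of that check in isolation. The list below is only a stand-in for what find_all() returns in the original script, and should_send_email is a hypothetical helper name, not something from the original code:

```python
def should_send_email(match_count, threshold=3):
    """Return True only when the phrase appears more often than the threshold."""
    return match_count > threshold

# Stand-in for the list returned by soup.body.find_all(...) (assumed data)
results = ["Grab and Go", "Grab and Go", "Grab and Go"]

if should_send_email(len(results)):
    print("Change detected, go on to send the email")
else:
    print("No change")
```

With three matches and a threshold of three, this takes the "No change" branch.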
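The Python str.find() method doesn't return the number of occurrences. It returns the lowest index at which the substring is found, or -1 when it is not found. So comparing its result to 3 or -3 tests a character position, not a count, and a result of -3 can never happen, which is why the else branch always ran.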
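To see the difference, here is a small sketch on a made-up string (page_text is an assumed sample, not the real page):

```python
# Assumed sample text standing in for the downloaded page
page_text = "Lunch info. Grab and Go meals on Monday. Grab and Go meals on Friday."
searched_word = "Grab and Go"

first_index = page_text.find(searched_word)   # position of the first match, or -1
occurrences = page_text.count(searched_word)  # number of non-overlapping matches
missing = page_text.find("pizza")             # -1 because the substring is absent

print(first_index, occurrences, missing)
```

str.count() gives a number you can compare against a threshold. find() is only useful for asking where, or whether, the phrase appears.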
"Grab and Go" isn't being found 3 times but 2 times while "grab and go" is being found 1 time. The searches you are doing are case sensitive.
I did notice that the bs4 search was case sensitive. I've actually looked into ways to deal with that, but forgot about it.
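One way to deal with the case sensitivity, sketched on an assumed sample string rather than the real page, is to pass re.IGNORECASE to the regular expression. The same flag can be given to re.compile() in the pattern used for find_all():

```python
import re

# Assumed sample text standing in for the page contents
page_text = "Grab and Go meals Monday. grab and go pickup at noon. Grab and Go Friday."
searched_word = "Grab and Go"

# re.escape keeps the phrase literal in case it ever contains regex characters
pattern = re.escape(searched_word)

case_sensitive = len(re.findall(pattern, page_text))
case_insensitive = len(re.findall(pattern, page_text, flags=re.IGNORECASE))

print(case_sensitive, case_insensitive)
```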
I used Dan's approach and it worked. But I had to take it out of the while loop because it was running endlessly.
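The endless run is most likely because `continue` jumps straight back to the top of `while True` with no pause, so the page gets fetched again immediately for as long as the count stays at or below the threshold. If you do want to keep polling, a rough sketch is below. wait_for_change, get_count, delay_seconds and max_checks are all hypothetical names, and get_count stands in for the download-and-count step:

```python
import time

def wait_for_change(get_count, threshold=3, delay_seconds=3600, max_checks=24):
    """Poll get_count() until it exceeds threshold; return True if it did."""
    for _ in range(max_checks):
        if get_count() > threshold:
            return True
        print("No change")
        time.sleep(delay_seconds)  # pause between checks instead of spinning
    return False

# Usage with canned counts instead of a real page download
counts = iter([3, 3, 4])
changed = wait_for_change(lambda: next(counts), delay_seconds=0)
print(changed)
```

The max_checks bound means the loop always ends, and the email step would only run when the function returns True.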