Job data validation with Indeed website #229
Comments
just tried and turns out
Just a follow-up on this: if we look at a job search in the browser, it shows 3 jobs in total, but if we do the same search (or at least I think it's the same) with jobspy:
from jobspy import scrape_jobs

data = scrape_jobs(
    site_name="indeed",
    search_term="VP of finance",
    is_remote=False,
    location="Ontario",
    results_wanted=200,
    hours_old=24 * 7,  # look back one week
    country_indeed="Canada",
    verbose=True,
    distance=int(50 / 1.60934),  # 50 km converted to miles
)
it returns more jobs, but I don't see 2 of the 3 jobs from the browser in the result:
Even if I extend the
@cullenwatson Any idea what's causing this, or am I missing something?
I'm trying to validate the (Indeed) data I got from JobSpy against the job listings I see directly on the Indeed website, given the same search params, such as date_posted.
For example, a listing shows as 10 days old (see attachment), but when I go to Indeed and search within the last 24 hours, it shows up as well. Shouldn't that listing then have a date_posted of either Dec 23 (today) or Dec 22, instead of Dec 18?
In general, is it a robust way to validate the data we get from JobSpy simply by looking it up against the Indeed website, or might there be some discrepancies due to non-obvious things?
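To make the date_posted check above concrete, here is a minimal sketch of the comparison being described. The helper name `posted_within_window` is hypothetical and not part of JobSpy, the year in the usage lines is an arbitrary assumption, and the sketch assumes date_posted has day granularity only, so the window is rounded up to whole days.

```python
import math
from datetime import date, timedelta


def posted_within_window(date_posted: date, today: date, hours_old: int) -> bool:
    """Return True if date_posted falls inside the last `hours_old` hours.

    Hypothetical helper for manual validation. Because date_posted is
    assumed to carry no time of day, the window is rounded up to whole days.
    """
    days = math.ceil(hours_old / 24)
    cutoff = today - timedelta(days=days)
    return cutoff <= date_posted <= today


# Dates from the example above; the year is an arbitrary assumption.
today = date(2024, 12, 23)
in_24h = posted_within_window(date(2024, 12, 18), today, hours_old=24)
in_week = posted_within_window(date(2024, 12, 18), today, hours_old=24 * 7)
```

With these inputs, a listing dated Dec 18 falls outside a 24-hour window but inside a 7-day one, which is the mismatch being asked about.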
I took a quick look at the code and found this line:
f'location: {{where: "{self.scraper_input.location}", radius: {self.scraper_input.distance}, radiusUnit: MILES}}'
I'm searching the ca website, which would default to km for the radius, so for validation purposes, I suppose I should set the unit to something like KMS instead of MILES?
Thanks!
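As a workaround sketch while the query hardcodes MILES, a radius chosen in kilometres on the website can be converted before it is passed as distance. The helper names `km_to_miles` and `location_fragment` are hypothetical; the fragment format simply mirrors the line quoted above and is not a statement about how the library builds its full query.

```python
KM_PER_MILE = 1.60934


def km_to_miles(km: float) -> int:
    """Convert a radius in kilometres to whole miles (rounded to nearest)."""
    return round(km / KM_PER_MILE)


def location_fragment(where: str, radius_miles: int) -> str:
    """Build a fragment shaped like the quoted line (illustration only)."""
    return f'location: {{where: "{where}", radius: {radius_miles}, radiusUnit: MILES}}'


radius = km_to_miles(50)
fragment = location_fragment("Ontario", radius)
print(fragment)
```

Rounding means the converted radius only approximates the website's km radius, so listings near the edge of the search area may still differ between the two.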