Scraping company ratings from an Indeed job site using R

Although I have experience with R, I am new to HTML and CSS. I have been researching various web scraping methods both online and on Stack Overflow in order to implement them using R. However, I am encountering difficulties when it comes to extracting company ratings from job listing pages. Instead of retrieving the expected rating of 4.0 from the example URL, I keep getting character(0).

Below is my approach:

library(rvest)
library(tidyverse)
library(xml2)

#example URL
url<- "https://www.indeed.com/viewjob?jk=a25a91736b1f7042&tk=1e3q54n49heai800&from=serp&vjs=3&advn=8876452989351355&adid=95236293&sjdu=TDSJNe66qIM3gcXFOG94m--bPylNW2vvO3WAHEKN7JhCAD1FQ-2FXD1gQyElsLNkg6gfXO2CD3rQYOYjO9iXITyFdYOp8tCECkHuDmf3Og8qdMmciGFIv2ahigETjLmuY8uXdLjnQTg4__yOXqHJkA"

page <- read_html(url)

page %>%
   rvest::html_nodes("span")  %>%
   rvest::html_nodes(xpath = '//*[contains(concat( " ", @class, " " ), concat( " ", "ratingsContent", " " ))]')%>%
   rvest::html_text()

#Output is 
#character(0)
#It should return 4.0 instead!

Can anyone provide guidance on how to achieve this, and also suggest a method for returning NA if the company rating is missing? Thank you!

Answer №1

It appears that the xpath you are using is incorrect. Upon examining the source document, it seems that the desired value can be found within the content attribute of meta tags with the itemprop attribute set to "ratingValue".

Here is a functional example based on your provided URL:

# read the page, select the ratingValue meta tags, and pull their content attribute
read_html(url) %>%
  html_elements(xpath = "//meta[contains(@itemprop, 'ratingValue')]") %>%
  html_attr("content") %>%
  unique()
#> [1] "3.5"
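
The question also asks how to return NA when a listing has no rating. One possible approach, shown here as a minimal sketch, is to wrap the lookup in a small function and check whether anything matched. The helper name get_rating and the two inline HTML snippets are assumptions made up for illustration; they stand in for real pages so the example runs without a network request, and the actual Indeed markup may differ.

```r
library(rvest)

# Hypothetical helper (not part of rvest): returns the first ratingValue
# found in the page as a number, or NA when no matching meta tag exists.
get_rating <- function(page) {
  rating <- page %>%
    html_elements(xpath = "//meta[@itemprop = 'ratingValue']") %>%
    html_attr("content")
  if (length(rating) == 0) NA_real_ else as.numeric(rating[[1]])
}

# Made-up HTML standing in for a listing with and without a rating
with_rating <- read_html(
  '<html><head><meta itemprop="ratingValue" content="3.5"></head><body></body></html>'
)
without_rating <- read_html("<html><body><p>No rating here</p></body></html>")

get_rating(with_rating)
#> [1] 3.5
get_rating(without_rating)
#> [1] NA
```

Because the function always returns a single value, it can be applied over a vector of pages (for example with purrr::map_dbl) without the missing ratings shortening the result.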
