"Using html_attr with the attribute "href" does not return any value in the rvest package

I want to extract the URLs linked from specific CSS elements on a website using rvest. I have tried several approaches, including html_attr with the 'href' argument, but my script returns only NA values instead of the expected URLs.

Code snippet for setting up variables

library(rvest)

my_url <- "http://www.sherdog.com/events/UFC-Fight-Night-111-Holm-vs-Correia-58241"

my_read_url <- read_html(my_url)

my_nodes <- html_nodes(my_read_url, ".fighter_result_data a span , .right_side a span , .left_side a span")

Verify if my_nodes correspond to athletes' names

html_text(my_nodes)

Display that my_nodes are selecting the desired CSS elements

 [1] "Holly Holm"          "Bethe Correia"       "Marcin Tybura"      
 [4] "Andrei Arlovski"     "Colby Covington"     "Dong Hyun Kim"      
 [7] "Rafael dos Anjos"    "Tarec Saffiedine"    "Jon Tuck"           
[10] "Takanori Gomi"       "Walt Harris"         "Cyril Asker"        
[13] "Alex Caceres"        "Rolando Dy"          "Yuta Sasaki"        
[16] "Justin Scoggins"     "Jingliang Li"        "Frank Camacho"      
[19] "Russell Doane"       "Kwan Ho Kwak"        "Naoki Inoue"        
[22] "Carls John de Tomas" "Lucie Pudilova"      "Ji Yeon Kim"  

Attempt to retrieve URLs for each athlete's unique pages

html_attr(my_nodes, "href")

The output indicates that my efforts only yield a list of NA values

[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

If anyone can provide assistance in successfully obtaining the URLs instead of these NA values, I would greatly appreciate it. Thank you!

Answer №1

Your html_nodes call is selecting the span elements, not the a elements. Only the a elements carry an href attribute; the span elements nested inside them do not, so html_attr returns NA for each one. Drop the trailing span from each selector:

my_nodes <- html_nodes(my_read_url, ".fighter_result_data a, .right_side a, .left_side a")
html_text(my_nodes)
html_attr(my_nodes, "href")
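
To see the difference without hitting the live site, here is a minimal sketch on a made-up HTML fragment. The markup and the fighter names are hypothetical, only loosely modelled on the structure the selectors above imply; the last line shows one possible way to keep a span selection and step up to the enclosing a element instead.

```r
library(rvest)

# Hypothetical markup (not copied from the real page): each name is a <span>
# nested inside the <a> that carries the href.
page <- read_html('
  <div class="fighter_result_data">
    <a href="/fighter/Example-One-1"><span>Example One</span></a>
    <a href="/fighter/Example-Two-2"><span>Example Two</span></a>
  </div>')

spans <- html_nodes(page, ".fighter_result_data a span")
links <- html_nodes(page, ".fighter_result_data a")

# Both selections yield the same text...
span_text <- html_text(spans)
link_text <- html_text(links)

# ...but only the <a> nodes have an href attribute.
span_href <- html_attr(spans, "href")  # NA for every span
link_href <- html_attr(links, "href")  # the relative URLs

# If the span selection has to stay, step up to the parent <a> via XPath.
parent_href <- html_attr(html_node(spans, xpath = "parent::a"), "href")
```

Selecting the a elements directly is the simpler fix; the XPath step is only worth it when the span selector is needed for something else as well.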

Answer №2

As @MrFlick mentioned, the hyperlinks live on the <a> tags, so those are the nodes you need to select.

my_url %>%
  read_html() %>%
  html_nodes('.fighter_result_data') %>%
  html_nodes('a') %>%
  html_attr('href')
[1] "/fighter/Marcin-Tybura-86928"        "/fighter/Andrei-Arlovski-270"   
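
The hrefs come back as site-relative paths. If full URLs are needed, a sketch like the following may help; it uses xml2::url_absolute and assumes the fighter pages are served from the same host as the event page, which the thread does not confirm.

```r
library(xml2)

# Base URL taken from the question; the two hrefs are the ones printed above.
base_url <- "http://www.sherdog.com/events/UFC-Fight-Night-111-Holm-vs-Correia-58241"
hrefs <- c("/fighter/Marcin-Tybura-86928", "/fighter/Andrei-Arlovski-270")

# Resolve each relative path against the base URL (assumes same host).
fighter_urls <- url_absolute(hrefs, base_url)
```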
