Issues scraping database-backed data with Ruby

Currently, I am trying to extract the Name, Address, Phone Number, and Email Address of different resorts from certain page(s).

Link to Page

I'm fairly new to Ruby and have been searching for examples, but my case seems too specific to find a suitable solution.

My focus right now is on extracting the Email Address. After inspecting the element, I noted its CSS path (#category-listings > li:nth-child(1) > div > div > ul > li:nth-child(2) > a) and wrote a Ruby script to try to retrieve this data:

require 'nokogiri'
require 'open-uri'

PAGE_URL = "http://www.exploreminnesota.com/places-to-stay/resorts/?keywords=&pageIndex=0&radius=0&mapTab=false&sortOrder=asc&sort=randomdaily&locationid=&startDate=false&class_id=7&lat=&lon=&city=&pageSize=20&type=reitlistings&attrFieldsOr="

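# URI.open is needed on Ruby 3+, where Kernel#open no longer accepts URLs
page = Nokogiri::HTML(URI.open(PAGE_URL))

# Select the email link of the first listing
site1 = page.css('#category-listings li:nth-child(1) div div ul li:nth-child(2) a')
puts site1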

The output:

href="mailto:%7B%7Br._source.database_fields.email%7D%7D" class="button gaTracker" title="{{r._source.database_fields.email}}" data-tracker-type="event" data-category="Email" data-label="{{r._source.location.split('/')[1]}}" data-action="{{url | analyticsAction}}">Email

As you can see, instead of the email address, the title contains the placeholder {{r._source.database_fields.email}}.

When examining this element, the data appears like this:

href="mailto:[email protected]" class="button gaTracker" title="[email protected]" data-tracker-type="event" data-category="Email" data-label="gull-four-seasons-resort" data-action="Places to Stay">Email

I am unsure how to access the data as it appears in the browser inspector. Any guidance would be greatly appreciated. It would help me understand not only HTML/CSS better, but also how data from a database ends up on a page.

Thank you!

Answer №1

The {{...}} placeholders in your output are unrendered template tags: the page fills in the listing data with JavaScript after it loads, so Nokogiri only sees the empty template. Instead, you can request the JSON data from exploreminnesota.com directly, which eliminates the need for Nokogiri. The code below fetches that JSON, parses it into a Ruby Hash, and pretty-prints it to the terminal.

require "open-uri"
require "json"

url = "http://www.exploreminnesota.com/getJsonData.ashx?id=61&keywords=&pageIndex=0&radius=0&mapTab=false&sortOrder=asc&sort=randomdaily&locationid=&startDate=false&class_id=7&lat=&lon=&city=&pageSize=20&type=reitlistings&attrFieldsOr="

response_file = URI.open(url) # Make the HTTP request (Kernel#open no longer accepts URLs on Ruby 3+)
response_json = JSON.parse(response_file.read) # Parse the JSON body into a Ruby Hash

puts JSON.pretty_generate(response_json)

The URL assigned to url points at getJsonData.ashx, so the response contains only the listing data as JSON rather than HTML content.
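If you need to vary the query, it may be easier to build that URL from a hash of parameters than to edit the long string by hand. The following is only a sketch: the helper name listings_url is made up for illustration, the parameter names and values are copied from the URL above, and the idea that changing pageIndex returns the next page of results is an assumption I have not verified.

```ruby
require "uri"

# Endpoint taken from the URL above.
BASE_URL = "http://www.exploreminnesota.com/getJsonData.ashx"

# Query parameters copied, in order, from the URL above.
DEFAULT_PARAMS = {
  "id" => "61",
  "keywords" => "",
  "pageIndex" => "0",
  "radius" => "0",
  "mapTab" => "false",
  "sortOrder" => "asc",
  "sort" => "randomdaily",
  "locationid" => "",
  "startDate" => "false",
  "class_id" => "7",
  "lat" => "",
  "lon" => "",
  "city" => "",
  "pageSize" => "20",
  "type" => "reitlistings",
  "attrFieldsOr" => ""
}.freeze

# Hypothetical helper: merges overrides into the defaults and
# encodes the result as a query string.
def listings_url(overrides = {})
  params = DEFAULT_PARAMS.merge(overrides)
  "#{BASE_URL}?#{URI.encode_www_form(params)}"
end

# Assumption: pageIndex selects the page of results.
puts listings_url({ "pageIndex" => "1" })
```

Called with no arguments, listings_url rebuilds the same URL that is assigned to url above.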

To find this URL, you can use the Chrome inspector. By viewing the Network tab with cache disabled and filtering for XHR requests, you can identify the relevant network request containing the JSON data you need. You can further explore this data within the inspector by expanding and collapsing objects in the Preview tab.

If you want to display a resort's email address using the example above, add the following line:

puts response_json["hits"]["hits"][0]["_source"]["database_fields"]["email"]

This will print the email address of the first resort in the JSON response.
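To get the email of every resort rather than just the first, you can loop over the hits array. Here is a minimal sketch, assuming every entry has the same shape as the first one. The method name listing_emails and the sample data are made up for illustration; in practice you would pass in the response_json from the example above. Hash#dig returns nil instead of raising when a key is missing, so listings without an email are simply skipped.

```ruby
require "json"

# Hypothetical helper: collects the email of every listing in a parsed response.
# Listings without an email are dropped by compact.
def listing_emails(response_json)
  hits = response_json.dig("hits", "hits") || []
  hits.map { |hit| hit.dig("_source", "database_fields", "email") }.compact
end

# Made-up sample that mimics the shape of the real response.
sample = JSON.parse(<<~SAMPLE)
  {
    "hits": {
      "hits": [
        { "_source": { "database_fields": { "email": "first@example.com" } } },
        { "_source": { "database_fields": {} } },
        { "_source": { "database_fields": { "email": "second@example.com" } } }
      ]
    }
  }
SAMPLE

puts listing_emails(sample)
```

The other fields you are after (name, address, phone number) presumably sit next to email under database_fields, but check the exact key names in the Preview tab before relying on them.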
