Here is an example HTML input:
<div class="grid--item user-info user-hover">
<div class="user-gravatar48">
<a href="/users/22656/jon-skeet">
<div class="gravatar-wrapper-48"><img src="https://www.gravatar.com/avatar/6d8ebb117e8d83d74ea95fbdd0f87e13?s=96&d=identicon&r=PG" alt="Jon Skeet's user avatar" width="48" height="48" class="bar-sm"></div>
</a>
</div>
<div class="user-details">
<a href="/users/22656/jon-skeet">Jon Skeet</a>
<span class="user-location">Reading, United Kingdom</span>
<div class="-flair">
<span class="reputation-score" title="reputation score 1,440,518" dir="ltr">1.4m</span><span title="873 gold badges" aria-hidden="true"><span class="badge1"></span><span class="badgecount">873</span></span><span class="v-visible-sr">873 gold badges</span><span title="9172 silver badges" aria-hidden="true"><span class="badge2"></span><span class="badgecount">9172</span></span><span class="v-visible-sr">9172 silver badges</span><span title="9224 bronze badges" aria-hidden="true"><span class="badge3"></span><span class="badgecount">9224</span></span><span class="v-visible-sr">9224 bronze badges</span>
</div>
</div>
<div class="user-tags">
<a href="/questions/tagged/c%23">c#</a>, <a href="/questions/tagged/java">java</a>, <a href="/questions/tagged/.net">.net</a>
</div>
</div>
I am specifically looking for the first two span elements directly under the div with the class user-details.
I have attempted to select them using a CSS selector, but it returns more results than anticipated: https://try.jsoup.org/~orgt_meWno3AxzO0GzxMBldEhIk
My Python implementation shows the same problem:
from bs4 import BeautifulSoup

# `html` is a string holding the markup shown above
soup = BeautifulSoup(html, 'html.parser')
soup.select('div.user-details span:nth-child(-n+2)')
# [<span class="user-location">Reading, United Kingdom</span>,
# <span class="reputation-score" dir="ltr" title="reputation score 1,440,518">1.4m</span>,
# <span aria-hidden="true" title="873 gold badges"><span class="badge1"></span><span class="badgecount">873</span></span>,
# <span class="badge1"></span>,
# <span class="badgecount">873</span>,
# <span class="badge2"></span>,
# <span class="badgecount">9172</span>,
# <span class="badge3"></span>,
# <span class="badgecount">9224</span>]
The expected output should contain only these elements:
# [<span class="user-location">Reading, United Kingdom</span>,
# <span class="reputation-score" dir="ltr" title="reputation score 1,440,518">1.4m</span>]
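To make the mismatch easier to see, here is a reduced sketch that prints, for every match, which parent it belongs to and its position among that parent's element children. It assumes a trimmed copy of the markup above (a single badge group), and `label` and `position` are hypothetical helpers added only for this illustration. The point it seems to show is that `:nth-child(-n+2)` is evaluated against each span's own parent, while the descendant combinator (the space) reaches spans at any depth.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the sample markup: one badge group is enough to show the effect.
html = """
<div class="user-details">
<a href="/users/22656/jon-skeet">Jon Skeet</a>
<span class="user-location">Reading, United Kingdom</span>
<div class="-flair">
<span class="reputation-score">1.4m</span><span title="873 gold badges"><span class="badge1"></span><span class="badgecount">873</span></span><span class="v-visible-sr">873 gold badges</span>
</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def label(tag):
    # Hypothetical helper: identify a tag by its class, else its title, else its name.
    return " ".join(tag.get("class", [])) or tag.get("title", tag.name)


def position(tag):
    # 1-based index among element siblings, which is what :nth-child counts.
    return len(tag.find_previous_siblings(True)) + 1


matches = soup.select("div.user-details span:nth-child(-n+2)")
rows = [(label(m), label(m.parent), position(m)) for m in matches]

for span, parent, pos in rows:
    print(f"{span!r} is child #{pos} of {parent!r}")
```

Every match is child #1 or #2 of some parent, but the parents differ: the user-details div, the -flair div, and the badge wrapper span.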
Can you help me identify what is wrong with my CSS selector and suggest a correction?
I am interested in a fix to the CSS selector itself rather than in alternative approaches.
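One possible direction, offered as a sketch rather than a definitive fix: in the sample markup both wanted spans carry their own class, so a selector list keyed on those classes returns exactly the two elements. This assumes the class names user-location and reputation-score are stable across the pages being scraped, and it selects by class rather than by position, so it may not be what is needed if the positional form is a hard requirement. The markup below is a trimmed copy of the sample.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the sample markup (assumption: the real page has the same classes).
html = """
<div class="user-details">
<a href="/users/22656/jon-skeet">Jon Skeet</a>
<span class="user-location">Reading, United Kingdom</span>
<div class="-flair">
<span class="reputation-score">1.4m</span><span title="873 gold badges"><span class="badge1"></span><span class="badgecount">873</span></span><span class="v-visible-sr">873 gold badges</span>
</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# A selector list: each comma-separated part is matched independently,
# and the combined results come back in document order.
wanted = soup.select(
    "div.user-details span.user-location, "
    "div.user-details span.reputation-score"
)

for span in wanted:
    print(span["class"], span.get_text())
```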