@inutano
Created July 3, 2025 06:00
require 'nokogiri'
require 'csv'
require 'open-uri'

# BOSC poster listing pages for ISMB/ECCB 2025, one per poster session (A, B, E).
urls = [
  'https://www.iscb.org/cms_addon/conferences/ismbeccb2025/posters.php?track=BOSC&session=A#search',
  'https://www.iscb.org/cms_addon/conferences/ismbeccb2025/posters.php?track=BOSC&session=B#search',
  'https://www.iscb.org/cms_addon/conferences/ismbeccb2025/posters.php?track=BOSC&session=E#search'
]

# Session letter (first character of the poster ID) => presentation day.
DAY_BY_SESSION = { 'A' => 'Jul-21', 'B' => 'Jul-22', 'E' => 'Jul-23' }.freeze

csv = CSV.generate do |csv_out|
  csv_out << ['Title', 'Poster Presenter', 'PosterID', 'Day', 'SubmissionId']

  # Submission IDs are generated locally as a running counter, not scraped.
  submission_counter = 1000

  urls.each do |url|
    html = URI.open(url).read
    doc = Nokogiri::HTML(html)

    # Each poster title is a <strong> element inside a div.well-sm block.
    doc.css('div.well-sm strong').each do |title_elem|
      title_text = title_elem.text.strip

      if title_text.start_with?('Virtual:')
        poster_id = 'Virtual'
        title = title_text.sub('Virtual:', '').strip
        day = 'Virtual'
      elsif title_text =~ /^(A|B|E)-\d+: /
        # Split "A-123: Some title" into the poster ID and the title.
        poster_id, title = title_text.split(':', 2).map(&:strip)
        day = DAY_BY_SESSION[poster_id[0]]
      else
        # Not a poster title; skip this <strong> element.
        next
      end

      # The author list is in the first following sibling div that contains li.author items.
      container_div = title_elem.ancestors('div.well-sm').first
      next_divs = container_div.xpath('following-sibling::div')

      presenter = 'UNKNOWN'
      next_divs.each do |div|
        if div.to_html.include?('<ul') && div.css('li.author').any?
          # Prefer the underlined name (<u><strong>); fall back to any bold author name.
          presenter_elem = div.at_css('li.author u strong') || div.at_css('li.author strong')
          if presenter_elem
            # Keep only the text before the first comma.
            presenter = presenter_elem.text.strip.split(',').first
          end
          break
        end
      end

      csv_out << [title, presenter, poster_id, day, "S-#{submission_counter}"]
      submission_counter += 1
    end
  end
end

puts csv