author | Joachim Filip Ignacy Bartosik <jbartosik@gmail.com> | 2010-08-03 19:06:03 +0200 |
---|---|---|
committer | Joachim Filip Ignacy Bartosik <jbartosik@gmail.com> | 2010-08-09 22:53:28 +0200 |
commit | e2ac024317f3568b7229452e5f5e3859eb91d2ad | |
tree | a64c8c0ae509fed2709efb3c7988d69f225e188b /lib/tasks | |
parent | When obtaining lead data from gentoo.org use XPath to find nodes in XML | |
When collecting lead data, include subproject leads
Diffstat (limited to 'lib/tasks')
-rw-r--r-- | lib/tasks/prepare.rake | 29 |
1 files changed, 25 insertions, 4 deletions
diff --git a/lib/tasks/prepare.rake b/lib/tasks/prepare.rake
index d9a90fc..f45f0dc 100644
--- a/lib/tasks/prepare.rake
+++ b/lib/tasks/prepare.rake
@@ -39,13 +39,34 @@ namespace :prepare do
     # Fetch xml document from uri. Collect all elements with name equal to
     # name parameter and array containing them (and only them).
     def get_all_tags_with_name(uri, name)
-      raw_data = Net::HTTP.get_response(URI.parse(uri)).body
-      project_data = REXML::Document.new(raw_data)
-      REXML::XPath.match(project_data, "//#{name}")
+      begin
+        raw_data = Net::HTTP.get_response(URI.parse(uri)).body
+        project_data = REXML::Document.new(raw_data)
+        REXML::XPath.match(project_data, "//#{name}")
+      rescue
+        # Warn if there was some error and return empty array.
+        # This way problems with one document won't break whole process.
+        # Down side is that leads may be marked as non-leads (this
+        # wouldn't happen if task crashed when encountering a problem)
+        Rails.logger.error "Error when trying to collect <#{name}> tags from #{uri}:"
+        Rails.logger.error $!
+        []
+      end
     end
 
     devs = []
-    projects = get_all_tags_with_name('http://www.gentoo.org/proj/en/metastructure/gentoo.xml?passthru=1', 'subproject')
+
+    projects = []
+    subprojects = get_all_tags_with_name('http://www.gentoo.org/proj/en/metastructure/gentoo.xml?passthru=1', 'subproject')
+    until subprojects.empty?
+      projects += subprojects
+
+      new_subprojects = subprojects.collect do |proj|
+        get_all_tags_with_name("http://www.gentoo.org#{proj.attribute('ref').to_s}?passthru=1", 'subproject')
+      end
+
+      subprojects = new_subprojects.flatten
+    end
 
     for proj in projects
       devs += get_all_tags_with_name("http://www.gentoo.org#{proj.attribute('ref').to_s}?passthru=1", 'dev')
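The `until subprojects.empty?` loop in this change walks the project tree one level at a time, and the new `rescue` makes a failed fetch count as "no subprojects". The sketch below illustrates that pattern without any network access. It is not code from the repository: `FAKE_DOCUMENTS`, `safe_subproject_refs`, and `collect_all_subprojects` are hypothetical names, the refs are invented, and a plain Hash stands in for the XML documents. It also adds a visited set, which the commit does not have, as one possible guard in case two projects ever list each other as subprojects.

```ruby
require 'set'

# Hypothetical stand-in for the fetched XML documents: each key is a made-up
# project ref, each value lists the refs of its direct subprojects.
FAKE_DOCUMENTS = {
  '/root.xml' => ['/a', '/b'],
  '/a'        => ['/a/x'],
  '/b'        => ['/b/y', '/a'], # '/a' is referenced a second time here
  '/a/x'      => ['/a']          # reference cycle back to '/a'
  # '/b/y' is deliberately missing to simulate a document that fails to load.
}.freeze

# Same idea as the rescue added in the commit: on any error, report it and
# return an empty array so one broken document does not abort the whole walk.
def safe_subproject_refs(ref)
  FAKE_DOCUMENTS.fetch(ref)
rescue StandardError => e
  warn "Error when trying to collect subprojects from #{ref}: #{e.message}"
  []
end

# Level-by-level walk modelled on the commit's loop. The `seen` set is an
# addition (assumption, not in the commit) so that each ref is processed
# once even if it is listed twice or is part of a cycle.
def collect_all_subprojects(root_ref)
  seen = Set.new
  projects = []
  level = safe_subproject_refs(root_ref)

  until level.empty?
    level = level.reject { |ref| seen.include?(ref) }.uniq
    level.each { |ref| seen << ref }
    projects += level
    level = level.flat_map { |ref| safe_subproject_refs(ref) }
  end

  projects
end

puts collect_all_subprojects('/root.xml').inspect
```

As the comment in the diff notes, swallowing errors trades completeness for robustness: a subtree behind a failed fetch is silently skipped, so its leads would be missed rather than the task crashing.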