Saving Scrapy items to MongoDB

In my Scrapy project's pipelines.py I am trying to save my scraped items to MongoDB. However, I'm not sure I'm doing it correctly, because after the scrape, when I go into the mongo shell and use the find() method, nothing comes back. During the scrape, Scrapy's log shows me that all of the items were scraped, and when I use the built-in save-to-JSON export, all of my items are scraped and written to the JSON file successfully. Here is what my pipelines.py looks like:

    import pymongo
    from scrapy.conf import settings
    from scrapy import log

    class MongoDBPipeline(object):
        def __init__(self):
            connection = pymongo.Connection(settings['MONGODB_HOST'],
                                            settings['MONGODB_PORT'])
            db = connection[settings['MONGODB_DATABASE']]
            self.collection = db[settings['MONGODB_COLLECTION']]

        def process_item(self, item, spider):
            self.collection.insert(dict(item))
            log.msg("Item wrote to MongoDB database {}, collection {}, at host {}, port {}".format(
                settings['MONGODB_DATABASE'], settings['MONGODB_COLLECTION'],
                settings['MONGODB_HOST'], settings['MONGODB_PORT']))
            return item
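
As a side note for anyone reading this later: pymongo.Connection and the scrapy.conf settings import are both deprecated interfaces (Connection was removed in pymongo 3.x in favor of MongoClient). On current versions the same pipeline would look roughly like the sketch below, using insert_one() and Scrapy's from_crawler hook; the setting names are the same ones from my settings.py below, but this is a sketch, not code I have run here:

    import pymongo

    class MongoDBPipeline(object):
        # Same behavior as above, written against pymongo >= 3.x and
        # Scrapy's from_crawler hook instead of the scrapy.conf global.

        def __init__(self, host, port, database, collection):
            self.client = pymongo.MongoClient(host, port)
            self.collection = self.client[database][collection]

        @classmethod
        def from_crawler(cls, crawler):
            s = crawler.settings
            return cls(s['MONGODB_HOST'], s['MONGODB_PORT'],
                       s['MONGODB_DATABASE'], s['MONGODB_COLLECTION'])

        def process_item(self, item, spider):
            # insert_one replaces the removed Collection.insert()
            self.collection.insert_one(dict(item))
            return item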

And in my settings.py:

    ITEM_PIPELINES = {'sportslab_scrape.pipelines.MongoDBPipeline': 300}
    MONGODB_HOST = 'localhost'        # Change in prod
    MONGODB_PORT = 27017              # Change in prod
    MONGODB_DATABASE = "training"     # Change in prod
    MONGODB_COLLECTION = "sportslab"
    MONGODB_USERNAME = ""             # Change in prod
    MONGODB_PASSWORD = ""             # Change in prod
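
Note that MONGODB_USERNAME and MONGODB_PASSWORD are defined here but never read by the pipeline above, so setting them in prod would silently do nothing. If auth is needed, one way to wire them in would be a small helper like this (a hypothetical sketch; make_client and the URI form are my own, not part of the project):

    import pymongo

    def make_client(settings):
        # Hypothetical helper: attach credentials only when a username is set.
        host, port = settings['MONGODB_HOST'], settings['MONGODB_PORT']
        user = settings.get('MONGODB_USERNAME')
        if user:
            uri = 'mongodb://{}:{}@{}:{}'.format(
                user, settings.get('MONGODB_PASSWORD'), host, port)
            return pymongo.MongoClient(uri)
        return pymongo.MongoClient(host, port)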

And in my Scrapy crawl log:

    2014-11-15 15:28:00-0800 [scrapy] INFO: Item wrote to MongoDB database training, collection sportslab, at host localhost, port 27017
    2014-11-15 15:28:00-0800 [max] DEBUG: Scraped from <200 http://www.maxpreps.com/high-schools/st-john-bosco-braves-(bellflower,ca)/football/stats.htm>
        {'athlete_name': u'Mike Ray',
         'games_played': u'4',
         'jersey_number': u'9',
         'receiving_long': u'7',
         'receiving_num': u'1',
         'receiving_tdnum': '',
         'receiving_yards': u'7',
         'receiving_yards_per_game': u'1.8',
         'school': u'St. John Bosco Football',
         'yards_per_catch': u'7.0'}
    2014-11-15 15:28:00-0800 [max] INFO: Closing spider (finished)
    2014-11-15 15:28:00-0800 [max] INFO: Dumping Scrapy stats:
        {'downloader/request_bytes': 283,
         'downloader/request_count': 1,
         'downloader/request_method_count/GET': 1,
         'downloader/response_bytes': 35344,
         'downloader/response_count': 1,
         'downloader/response_status_count/200': 1,
         'finish_reason': 'finished',
         'finish_time': datetime.datetime(2014, 11, 15, 23, 28, 0, 613000),
         'item_scraped_count': 28,
         'log_count/DEBUG': 31,
         'log_count/INFO': 35,
         'response_received_count': 1,
         'scheduler/dequeued': 1,
         'scheduler/dequeued/memory': 1,
         'scheduler/enqueued': 1,
         'scheduler/enqueued/memory': 1,
         'start_time': datetime.datetime(2014, 11, 15, 23, 28, 0, 83000)}
    2014-11-15 15:28:00-0800 [max] INFO: Spider closed (finished)
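
For reference, the stats above report 'item_scraped_count': 28, so if every insert succeeded, training.sportslab should contain 29 documents (28 items plus the 'test' placeholder mentioned below). A quick way to check the count (a sketch against the pymongo 2.x API used in the shell session below, where Collection.count() exists):

    from pymongo import Connection  # pymongo 2.x, matching the session below

    con = Connection()
    # 28 scraped items + 1 'test' placeholder -> expect 29 if every insert landed
    print(con.training.sportslab.count())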

Mongo queries from the Python shell. The 'test' document is just a placeholder I inserted to create the sportslab collection:

    >>> from pymongo import Connection
    >>> con = Connection()
    >>> db = con.training
    >>> sportslab = db.sportslab
    >>> print sportslab.find()
    <pymongo.cursor.Cursor object at 0x0000000002ADB438>
    >>> print sportslab.find_one()
    {u'test': u'test', u'_id': ObjectId('5466131ca319d723f08d2387')}
    >>>
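
One thing worth noting in the session above: printing the result of find() only prints the Cursor object itself, since pymongo cursors are lazy and fetch documents only when iterated. To actually see the documents, the cursor has to be looped over, for example:

    >>> for doc in sportslab.find():
    ...     print(doc)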
