Hadoop MongoDB connector: writing output to HDFS instead of MongoDB

Is it possible to read data from MongoDB with the Hadoop MongoDB connector, process it with Hadoop MapReduce, and then leave the MapReduce results as they are in HDFS, rather than writing them back to MongoDB through the connector?


I think this previous answer on SO answers your question, with a minor change:

Is it possible to read MongoDB data, process it with Hadoop, and output it into a RDBS(MySQL)?

The main difference is that you would set the OutputFormatClass to something like:

job.setOutputFormatClass( SequenceFileOutputFormat.class );

You'll also need to set the HDFS output path where the data should be saved (for example with FileOutputFormat.setOutputPath). See the connector's WordCount example for full code, but use the output format above instead of MongoOutputFormat.
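As a rough sketch of what such a job driver might look like: the code below reads from MongoDB through the connector's MongoInputFormat and writes a SequenceFile to HDFS. The class names MongoToHdfs, CategoryMapper, and SumReducer, the "category" field, the MongoDB URI, and the HDFS path are all placeholders invented for illustration, and Job.getInstance assumes Hadoop 2 or later (older releases used new Job(conf, name)). It needs the Hadoop and mongo-hadoop jars on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.bson.BSONObject;

import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.util.MongoConfigUtil;

public class MongoToHdfs {

    // Emits (category, 1) per document; "category" is a hypothetical field name.
    public static class CategoryMapper
            extends Mapper<Object, BSONObject, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);

        @Override
        protected void map(Object key, BSONObject doc, Context context)
                throws IOException, InterruptedException {
            Object category = doc.get("category");
            if (category != null) {
                context.write(new Text(category.toString()), ONE);
            }
        }
    }

    // Sums the counts for each category.
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Input side: still MongoDB, via the connector (placeholder URI).
        MongoConfigUtil.setInputURI(conf, "mongodb://localhost:27017/mydb.mycollection");

        Job job = Job.getInstance(conf, "mongo-to-hdfs");
        job.setJarByClass(MongoToHdfs.class);
        job.setMapperClass(CategoryMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setInputFormatClass(MongoInputFormat.class);
        // Output side: a plain Hadoop file format instead of MongoOutputFormat.
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileOutputFormat.setOutputPath(job, new Path("/user/hadoop/mongo-output")); // placeholder path

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

If human-readable output is preferred over SequenceFiles, swapping in Hadoop's TextOutputFormat should work the same way, since both are file-based output formats that take their path from FileOutputFormat.setOutputPath.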

Category: mongodb Time: 2012-04-01

