01:31
AndroidLoverInSF joined
01:57
joaojeronimo joined
02:28
dingomanatee joined
03:17
AndroidLoverInSF joined
03:28
LujeniSchool joined
03:57
<senorpedro>
is this possible in mongodb: "select * from ship where ship_id not in (Select ship_id from cruise)"?
04:10
gustavonalle joined
04:11
<j4kes>
Can I add a server to a replica set from one of the secondaries? My thought on this is no, because the replica set configuration is stored in a collection and a secondary does not have write access.
04:12
<j4kes>
So how do we handle auto deployment via some tool like chef? Should we do a lookup for the master, and then connect to the master to add the new node?
04:38
<nacer_>
what's the maximum document size in the latest version of mongodb?
04:59
<kelye>
hello, what's the final format for mongo date?
05:00
<nacer>
MacYET: that doesn't seem to be an updated value
05:00
<kelye>
i want to format a string as 'date' but the system where i'm building the input doesn't have any mongo driver so i have to build the date in the final format
05:01
<nacer>
MacYET: i have put a 30M file in a document
05:01
<MacYET>
your problem
05:01
<MacYET>
use gridfa
05:01
<MacYET>
use gridfs
05:01
<milkshak1s>
kelye: http://en.wikipedia.org/wiki/ISO_8601
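For kelye's case (building the date without a driver), a minimal sketch of producing an ISO 8601 UTC string with millisecond precision. MongoDB stores dates as milliseconds since the Unix epoch, so finer precision is dropped here. Whether the receiving import step turns this string into a real date type, rather than keeping it as a plain string, is an assumption that depends on the tool doing the import; the function name is hypothetical.

```python
from datetime import datetime, timezone


def to_iso8601_utc(moment: datetime) -> str:
    """Render a timezone-aware datetime as an ISO 8601 UTC string (milliseconds)."""
    utc = moment.astimezone(timezone.utc)
    # Truncate microseconds to milliseconds and append the UTC designator "Z".
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


example = to_iso8601_utc(
    datetime(2012, 2, 27, 15, 2, 49, 123456, tzinfo=timezone.utc)
)
print(example)  # 2012-02-27T15:02:49.123Z
```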
05:02
<kelye>
milkshak1s, NodeX thanks!
05:02
<milkshak1s>
pleasure
05:02
<milkshak1s>
good luck
05:03
<nacer>
can we build from source and have the mms support inside server core
05:34
tomasztomczyk joined
05:35
sjaak_trekhaak joined
05:35
<sjaak_trekhaak>
Is it possible to convert a field to MD5 in-query?
05:36
<sjaak_trekhaak>
eg db.things.find({hex_md5('name'): 'blablalbla'})
05:42
<NodeX>
I don't think JavaScript has a native md5 function
05:43
<sjaak_trekhaak>
mongo client does
05:43
<sjaak_trekhaak>
> hex_md5('bla')
05:43
<sjaak_trekhaak>
128ecf542a35ac5270a87dc740918404
05:49
<Zelest>
if I use the PHP driver, is there any way to decide which server to read from in a replicaset? Or does it automatically pick the one with lowest ping?
05:50
<NodeX>
there is a pool iirc
05:50
<MacYET>
which has nothing to do with # of least connections
05:51
<NodeX>
mongodb://[username:password@]host1[:port1][,host2[:port2],...]/db
05:51
<NodeX>
from the php docs ^^
05:54
<sjaak_trekhaak>
NodeX: db.things.find({ $where : function() { return hex_md5(obj.name) == 'somemd5string'}});
05:54
<sjaak_trekhaak>
Does the trick
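A `$where` clause runs JavaScript against each document and cannot use an index, so it gets slow on large collections. One alternative, sketched here under assumptions, is to compute the digest once at write time and query it as an ordinary field. The field name `name_md5` and both helper functions are hypothetical; the dictionaries are what would be handed to a driver's insert and find calls.

```python
import hashlib


def with_name_md5(doc: dict) -> dict:
    """Return a copy of doc with a precomputed MD5 hex digest of its 'name' field."""
    copy = dict(doc)
    copy["name_md5"] = hashlib.md5(doc["name"].encode("utf-8")).hexdigest()
    return copy


def md5_filter(digest: str) -> dict:
    """Build a plain equality filter on the stored digest (indexable, unlike $where)."""
    return {"name_md5": digest.lower()}


stored = with_name_md5({"name": "bla"})
query = md5_filter(stored["name_md5"])
print(stored, query)
```

The trade-off is a little extra storage and the need to keep the digest in sync whenever `name` changes.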
05:55
<Zelest>
NodeX, well, in order to have the failover working, I connect to all 3 nodes.. and it will read/write to the master unless i setSlaveOkay() .. then it reads from the slaves and writes to the master.. however, I have a slave that is located quite far away and would prefer to only read from the "local" slave.
05:56
<NodeX>
I'm not sure you can do it that way but you can add a flag to shards for edge cases iirc in the later mongod's
05:58
<nacer>
Zelest: i think you can add some datacenter awareness to your replica set
05:59
<Zelest>
doesn't that only apply to writes where you want to force replication upon writes? :o
05:59
<nacer>
NodeX: oups :)
05:59
<NodeX>
I have never used it but i would think it's bidirectional
06:00
<NodeX>
reads/writes go over the same line ... it's about the network speed / location rather than the physical server / capacity
06:05
<Zelest>
seems to be fairly sorted out of the box.. seeing it uses the slave with lowest ping anyways.
06:07
<Zelest>
Derick, around?
06:07
<Derick>
what's the question?
06:07
<Zelest>
is it "safe" to just bump the version numbers in the freebsd ports from 1.2.7 to 1.2.9? (pecl-mongo) ?
06:08
<Derick>
it's fixing bugs...
06:08
<Zelest>
ah, so the build process and all is still the same?
06:08
<Zelest>
the ports maintainer is slacky.. and I want 1.2.9 now ;)
06:17
gustavonalle_ joined
06:20
<Killerguy>
I have a little problem when moving all my chunks to another shard
06:20
<Killerguy>
I ran removeShard, and now it's in draining mode
06:21
<Killerguy>
but I got these errors on my config server:
06:21
<Killerguy>
[conn31] assertion 10057 unauthorized db:config lock type:-1
06:22
<Killerguy>
and this on my mongos :
06:22
<Killerguy>
[Balancer] moveChunk result: { assertion: "field not found, expected type 2", assertionCode: 13111, errmsg: "db assertion failure", ok: 0.0 }
06:22
<Killerguy>
all my cluster is running with a keyfile, the same for all
06:27
beawesomeinstead joined
06:30
<Zelest>
how can I see what my oplog collection is capped to? i tried db.oplog.rs.stats() .. not sure what to look for? max? storageSize? size? :P
06:30
<Zelest>
or is there any other way of getting this value?
06:32
<Zelest>
ugh, the mongo CLI client seriously needs to make ctrl+w act normally :P
06:32
<Zelest>
(e.g, remove the last word on the line)
06:33
niftylettuce joined
06:39
jab416171|Cloud joined
06:43
<Zelest>
Derick, regarding getHosts() .. my arbiter seems to have health 0.. even though rs.status() says it's 1.. :o
06:43
<janimo`>
is there a way of only working on the bson subdir and checking bson tests? I'd rather not wait for the whole mongo tree to rebuild each time
06:44
<janimo`>
or is there a better -dev channel to ask codebase related questions?
06:46
<MacYET>
only this #
06:46
<MacYET>
or ask on the -dev list
06:46
<janimo`>
MacYET, thanks
06:47
patricksroberts joined
06:47
<janimo`>
found bsondemo.cpp can be built easily, enough for now I guess
06:50
<Killerguy>
anybody have an idea about my problem?
06:50
<Derick>
Zelest: hmm
06:51
<Zelest>
Derick, first i thought that arbiter nodes don't bother with health, as they have no data anyways.. but rs.status() seems to use it, hence my question. :-)
06:52
<Derick>
yeah, that should work and show the same info
07:04
<Zelest>
how is mongodb funded btw? does 10gen make tons of money elsewhere and just make mongo for shits and giggles.. or is it funded by donations from the community or so?
07:04
<Zelest>
i guess the correct question is, is it possible to donate money to mongodb? :)
07:04
<Zelest>
and/or 10gen.
07:04
<_ollie>
I think there has been a significant VC investment recently
07:04
<Derick>
support and trainings mostly IIRC
07:04
<MacYET>
they are burning venture capital
07:17
<amr>
is there an easy way to find out if i've got an object in a collection already?
07:18
<amr>
i.e. exactly the same
07:19
<MacYET>
check for its _id
07:19
<amr>
hmm good point
07:20
<amr>
oh actually, this is for objects that have the same information, but have been inserted twice blindly (stupid i know)
07:20
<amr>
is there a query to find duplicates?
07:20
<MacYET>
create a unique index
07:21
<amr>
can you have compound indices? no one field is guaranteed to be unique
07:21
<amr>
but a pair of fields (symbol,date) should always be unique
07:21
<MacYET>
i don't care - ensure that something is unique.....
07:22
<amr>
i dont care if you care, it was a question
07:22
<MacYET>
it works exactly as in a rdbms
07:23
<amr>
get off your high horse
07:23
<amr>
thanks anyway
07:24
<MacYET>
try it with a pony
07:49
<remonvv>
amr, you can create compound indexes with a unique constraint
07:49
<remonvv>
In such cases the combination has to be unique.
07:50
<amr>
yeah that's the sort of thing i need
07:50
<amr>
just realised my data importer is formatting dates incorrectly, bugger
07:51
<amr>
not my brightest moment
07:51
<remonvv>
If you do that it'll notify you if there are duplicate combinations. If you add dropDups=true it will automatically (but randomly) drop duplicates that do not meet the constraint.
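Before adding a unique compound index, it can help to see which (symbol, date) pairs are actually duplicated, since dropping duplicates automatically does not let you choose which copy survives. A minimal application-side sketch follows; the sample rows are invented for illustration and the function name is hypothetical.

```python
from collections import Counter


def duplicate_keys(docs, fields=("symbol", "date")):
    """Return {key_tuple: count} for every field combination seen more than once."""
    counts = Counter(tuple(doc[f] for f in fields) for doc in docs)
    return {key: n for key, n in counts.items() if n > 1}


# Invented sample data: the first two rows share the same (symbol, date) pair.
rows = [
    {"symbol": "AAA", "date": "2012-02-24", "close": 10.0},
    {"symbol": "AAA", "date": "2012-02-24", "close": 10.0},
    {"symbol": "AAA", "date": "2012-02-27", "close": 10.5},
    {"symbol": "BBB", "date": "2012-02-24", "close": 7.25},
]
dupes = duplicate_keys(rows)
print(dupes)  # {('AAA', '2012-02-24'): 2}
```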
07:51
<remonvv>
We all have such moments kind sir ;)
07:51
<remonvv>
I famously swapped a < for a > and lost a client 16,000 euros.
07:51
<amr>
how'd a character swap end up costing so much?
07:52
<remonvv>
Well, think internet bubble time. I was an intern at a small internet company that somehow ended up making kiosk gambling games. At some point the client (illegally) wanted to add a check that if the machine was giving out more than X% of what was put in it should start cheating and not pay out as much.
07:53
<remonvv>
So I..as an intern...wrote that algorithm.
07:53
<remonvv>
And testing in those days was considered a waste of time. It was by my boss anyway.
07:53
<remonvv>
Super shady.
07:53
<amr>
sounds fun though :)
07:54
<remonvv>
Anyway, so it was something like cheat(paidOut * MAX_PAYOUT < in)
07:54
<remonvv>
and the cheat function had the epic mistake of that boolean parameter
07:54
<remonvv>
false = intentionally lose, true = intentionally win
07:54
<remonvv>
there you go
07:54
<remonvv>
swapped em
07:55
<remonvv>
Huge fan of testing and TDD these days.
07:55
<remonvv>
Want to hear the ironic tale of that story?
07:55
<remonvv>
Right, so those machines were deployed for a weekend, right. Obviously the client was pissed and whatnot. But the buzz among the gambling people apparently resulted in huge lines in front of their casino place the next week or so.
07:56
<remonvv>
They easily compensated for the 16k, even though the machines were fixed overnight (by a certain overly apologetic intern)
07:56
<amr>
damn, that's one way to drive business
07:56
<amr>
sort of loss-leading, but not quite
07:57
<remonvv>
Yes, and rather illegal. They've been out of business for a decade so I can say now ;)
07:57
<remonvv>
There's tons of regulations regarding that stuff that they should've followed, but didn't
07:57
<remonvv>
Was fun though.
07:57
<remonvv>
So yeah, unique indexes, great.
08:00
<amr>
i seem to be butchering the crap out of my dates
08:11
<skylamer`>
how to make when saving a doc with same duplicate values to just update the existing one, and not add a new document object? :)
08:12
corruptmemory joined
08:13
<amr>
i think you may want to try using upsert
08:19
<amr>
OperationFailure: cursor id '7379405810946430774' not valid at server
08:19
<amr>
well that was unexpected
08:20
<amr>
i see that happens when there's been a long time between ops, any way to prevent that?
08:20
<amr>
there could be up to a 14 second delay between inserts
08:21
<amr>
timeout=False for finds, but im not using a find in that script?
08:54
<benbro>
does aggregation fit real-time queries, or is it more appropriate for complex reporting?
09:02
<amr>
what units is fileSize in for db.stats() ?
09:02
<amr>
hmm, ive hit a gb and im about 15% done with my import
09:23
<amr>
this cant be right:
09:23
<amr>
"fileSize" : 1006632960,
09:23
<amr>
post-compact:
09:23
<amr>
"fileSize" : 2080374784,
09:23
<amr>
it's ~doubled in size?
09:24
<remonvv>
YOU'RE DOING IT WRONG!
09:24
<remonvv>
I mean, that's odd
09:24
<amr>
i deleted a lot of data, so repaired/compacted as i read was advised
09:25
<amr>
ive just run it again, and im back to the original number
09:25
<amr>
seems rather large for ~1mill objects
09:27
<amr>
could be the indexes?
09:27
<amr>
indexSize is 34142976
09:33
sjaak_trekhaak joined
09:38
<remonvv>
fileSize is not a very good measure
09:39
<amr>
thats reassuring
09:40
<remonvv>
Well, you know how MongoDB allocates its files right? 64Kb->128Kb->..->1Gb->2Gb->2Gb->2Gb
09:40
<remonvv>
or 1/4th of those sizes with --smallfiles
09:40
<amr>
when you hit 2gb it creates another 2gb one right?
09:41
<amr>
if i do totalSize() i get like 250mb, so im not sure why fileSize reports so much
09:43
sjaak_trekhaak joined
09:43
<amr>
two collections: 211225840 and 3046928 from totalSize
09:43
<remonvv>
Well, if filesize is 2Gb it just reserved the 1Gb file. If the sum of data, indexes and journal is significantly lower than 1Gb it is odd.
09:43
<amr>
fileSize is 1006632960
09:44
<remonvv>
Okay, so it just created the 512Mb file
09:44
<remonvv>
So the sum should be over 512Mb.
09:44
<remonvv>
(64Kb+128Kb+...+128Mb+256Mb = ~512Mb)
09:44
<amr>
so filesize is about 3.5x the two totalSize
09:45
<amr>
how do you mean?
09:45
<remonvv>
Yes, but the filesize will remain the same while your totalSize will increase until it creates a new data file.
09:45
<amr>
ohh, i think i understand
09:45
<remonvv>
It preallocated the file right. So even if you use 1 byte beyond what can fit in the current data files it'll allocate a new one.
09:45
<remonvv>
And the new one might be a fresh 2Gb
09:46
<amr>
i'll just let it sort itself out :-)
09:46
<remonvv>
As such, fileSize is not a good measure for the size of your data unless you have many GB of data in which case the difference is small.
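The preallocation arithmetic behind the two fileSize values amr pasted can be sketched as follows. The sizes used (a first data file of 64 MB, each next file doubling up to a 2 GB cap) and the rule that one file is allocated ahead of need are assumptions about MongoDB of this era, not something stated in the chat. They are used because the sums of the first four and first five files match the two reported values exactly. Namespace and journal files are ignored.

```python
MB = 1024 * 1024


def data_file_sizes(count, first_mb=64, cap_mb=2048):
    """Sizes in bytes of the first `count` data files (assumed doubling scheme)."""
    sizes, size = [], first_mb
    for _ in range(count):
        sizes.append(size * MB)
        size = min(size * 2, cap_mb)
    return sizes


def files_allocated(data_bytes):
    """Files needed to hold data_bytes, plus one assumed to be preallocated ahead."""
    needed = 1
    while sum(data_file_sizes(needed)) < data_bytes:
        needed += 1
    return needed + 1


total_size = 211225840 + 3046928          # the two totalSize values from the chat
count = files_allocated(total_size)       # 3 files in use + 1 preallocated
print(count, sum(data_file_sizes(count)))  # 4 1006632960
print(sum(data_file_sizes(5)))             # 2080374784
```

Under these assumptions roughly 214 MB of data already accounts for a 960 MB fileSize, which is why the ratio looks alarming on a small database and shrinks as the data grows.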
09:46
<remonvv>
You do that ;)
09:46
<amr>
well when the import finishes i might have a tinker
09:47
<remonvv>
Tinkering is good. The only field I really look at is indexSize and paddingFactor
09:47
<amr>
i shall bear that in mind
09:48
<remonvv>
of the collection level stats() call, that is
09:48
<remonvv>
paddingFactor tells you how much mongo is reserving for future document growth per document
09:48
<remonvv>
If that's high it's wasting a lot of space, if it's very low it'll mean your updates will usually force MongoDB to physically move your document.
09:48
<treamer_>
> NumberLong(5)+NumberLong(3) -> NaN
09:48
<treamer_>
how can i add two longs?
09:50
<remonvv>
In shell? you don't
09:50
<remonvv>
NumberLong is a wrapper type so it ends up as a 64-bit integer BSON value.
09:50
<treamer_>
in no way? so for 2 lines i must use java or some other driver?
09:50
<remonvv>
Shell treats all numbers as floats. I don't know why.
09:51
<remonvv>
Not that I know of. JavaScript doesn't have 64-bit signed integer types
09:52
<remonvv>
The number type is 64-bit float with 53-bit guaranteed integer precision (meaning, beyond that you get rounding errors due to the nature of floating point storage)
09:54
<treamer_>
why don't they provide some functions to manipulate longs?
09:54
<remonvv>
Ask the JavaScript designers.
09:54
<remonvv>
Most dynamically typed languages have this problem, by the way.
09:54
<remonvv>
Or at least some variant of this problem.
09:54
<remonvv>
They're of the "a number's a number's a number" school of thought.
09:55
<remonvv>
Which, if you allow me to comment, is rather dumb.
09:56
<treamer_>
i know, but one can write some workaround, just bunch of functions, like BigDecimal in java
09:56
<remonvv>
Ruby is pretty elegant in that its dynamic typing works.
09:56
<remonvv>
Well, you can do that of course.
09:56
<remonvv>
I think there's a few libraries that allow that
09:57
<treamer_>
and the only way to extract data from numberlong is to parse its string representation?
09:57
<remonvv>
or use the floatApprox field
09:57
<* treamer_>
created new class in IDEA
09:57
<treamer_>
so i'll write it in java
09:57
<remonvv>
e.g. NumberLong(6).floatApprox + NumberLong(5).floatApprox
09:58
<remonvv>
Approx because it can't represent all long values, only the first 53 bits of one, as mentioned.
09:58
<remonvv>
If your values are guaranteed to be lower than that, the approach above is pretty safe
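The 53-bit limit can be checked without a database. The sketch below uses Python floats, which are the same IEEE 754 doubles the shell's number type uses, to show where a 64-bit integer stops surviving the round trip through a double. The helper name is hypothetical.

```python
LIMIT = 2 ** 53  # doubles represent every integer up to this magnitude exactly


def survives_as_double(value: int) -> bool:
    """True if converting value to a double and back returns the same integer."""
    return int(float(value)) == value


print(float(6) + float(5))            # 11.0, small values add exactly
print(survives_as_double(LIMIT))      # True
print(survives_as_double(LIMIT + 1))  # False, rounds to the nearest representable double
```

So adding the approximate values is fine for counters and ids that stay well below 2**53, and unsafe for values that use the full 64-bit range.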
10:00
<remonvv>
Well I don't want to get into a language discussion but yeah, use Java ;)
10:08
<sina_>
can we join two collections in mongodb?
10:09
<treamer_>
db.a.find().forEach(function(x) { db.b.save(x) })
10:10
<treamer_>
works pretty slow, but i'm not sure of any other way
10:14
<horseT>
I have an issue searching for an object:
10:14
<horseT>
{ "_id" : ObjectId("4f4b9b1917269cca15000000"), "name" : "pol", "sso" : { "site1" : [ 1, 2, 3, 4, 5 ], "site2" : [ 10, 20, 30, 40, 50 ], "site3" : [ 100, 200, 300, 400, 500 ] } }
10:15
<horseT>
how to get object where {"sso.*" :" 1"}
10:23
<NodeX>
I dont think you can
10:23
<NodeX>
unless they are one large array of values
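There is no wildcard for key names, but when the site keys are known in advance a filter can be built with one `$or` branch per key. Matching a scalar against an array field succeeds when the array contains that value. This is a sketch under that assumption; the helper name is hypothetical. Note also that the stored values are numbers, so the filter uses `1` rather than the string `" 1"`.

```python
def any_site_contains(value, site_keys, prefix="sso"):
    """Build a filter matching documents where any listed sub-array contains value."""
    return {"$or": [{f"{prefix}.{key}": value} for key in site_keys]}


query = any_site_contains(1, ["site1", "site2", "site3"])
print(query)
# {'$or': [{'sso.site1': 1}, {'sso.site2': 1}, {'sso.site3': 1}]}
```

If the key names are not known up front, restructuring the document into a single array of values (optionally tagged with the site name) is the alternative NodeX describes.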
10:24
<majoh>
Hi, so I got an almost full disk, will it work if I stop mongodb and move all database files to a bigger disk (mounted at the old path)? The alternative seems to be to do some sharding but our main mongodb guy is on vacation... 0_o
10:25
<Derick>
majoh: yes, that should work (providing you take care of file permissions)
10:25
<majoh>
Derick: great
11:05
napperjabber joined
11:08
<remonvv>
"our main mongodb guy"..damn him!
11:24
<esseks>
Is it ok to have unix mongodb user with nologin shell?
11:25
<esseks>
wereHamster: thanks!
11:27
<amr>
definitely need to work on my indexing
11:27
<amr>
taking ~4 secs to return from a find now
11:28
<amr>
is there a way of creating nicer dates in mongodb with pymongo? i get that you cant persist a datetime object
11:28
<amr>
oh, datetime.datetime can be persisted it seems
11:30
<MacYET>
as documented, yes
11:32
<amr>
thanks snarky, i quite clearly discovered that :-)
11:32
<NodeX>
amr : how many docs do you have
11:32
<NodeX>
to have to wait 4 secs on a find
11:41
<NodeX>
what kind of server is that on?
11:41
<amr>
my laptop, but im querying by dates + strings
11:41
<amr>
i assume the string part of a find() is slow?
11:42
<NodeX>
regex or not
11:42
<NodeX>
should be ok as long as it's indexed
11:42
<amr>
maybe ill work on my indexing
11:42
<NodeX>
ensureIndex({date:1,stringField:1});
11:43
<NodeX>
if you are searching them together
11:43
<NodeX>
date being your datfield
11:43
<NodeX>
date field *
11:44
<remonvv>
Indexing like a baws.
11:44
<remonvv>
3.3m docs should never result in query times higher than a few ms for simple equality filters.
11:44
<remonvv>
explain() is your friend.
11:44
<NodeX>
on a laptop though .. slow drive no doubt
11:45
<amr>
its an air, so SSD
11:45
<amr>
so shouldn't be slow
11:45
<NodeX>
how big are your documents?
11:45
<amr>
7 items in a dict
11:45
<amr>
all numbers, one string
11:45
<NodeX>
i mean size in kb or w/e
11:46
<NodeX>
less than 1k
11:46
<NodeX>
yer, you need some indexes on them lol
11:46
<NodeX>
I have a 43million doc collection on one server and I can query a geo spatial search with a radius in less than 10ms
11:47
<amr>
how large is that db?
11:47
<amr>
in file size terms
11:47
<NodeX>
I haven't checked its size recently
11:47
<amr>
should i wait for some imports to finish before doing ensureIndex?
11:48
<_ollie>
MacYET: surprised not to see you speaking at Mongo Berlin this year…
11:48
<NodeX>
I had a 15 million row collection where I forgot to index a field I was looking up previous searches on .. it hung my whole web app for 4s while it looked it up
11:48
<NodeX>
as soon as I indexed the web app went back to being fast
11:48
<NodeX>
index took about 3 minutes
11:49
<remonvv>
It just should not ever be slow when it directly hits an index. b-tree lookups are O(log N)
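A rough sense of why an indexed equality lookup stays fast at 3.3 million documents: the number of levels a balanced search tree needs grows with the logarithm of the document count. The fan-out of 100 below is a hypothetical figure for illustration, not a measured property of MongoDB's indexes.

```python
import math


def levels_needed(doc_count: int, fanout: int = 2) -> int:
    """Smallest number of tree levels so that fanout ** levels >= doc_count."""
    levels = 0
    capacity = 1
    while capacity < doc_count:
        capacity *= fanout
        levels += 1
    return levels


docs = 3_300_000
print(levels_needed(docs))        # 22 for a binary tree
print(levels_needed(docs, 100))   # 4 with a hypothetical fan-out of 100
print(round(math.log2(docs), 2))  # 21.65
```

A full scan of the same collection touches all 3.3 million documents, which is where multi-second finds on an unindexed field come from.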
11:49
<NodeX>
I personally set up the indexes when I make the collection.. but that's just personal preference
11:49
<amr>
i should probably do that
11:49
<NodeX>
you can set background:true if you don't want to wait
11:49
<remonvv>
MacYET speaks in public?
11:49
<NodeX>
or have a write/read lock
11:50
<amr>
ill just wait
11:50
<NodeX>
might just be a read lock ... can't remember
11:50
<amr>
want to make sure this import finishes this time
11:50
<NodeX>
imports are a biotch on large data
11:50
<remonvv>
import and do a background=false ensureIndex.
11:50
<remonvv>
background index creation is rather unpredictable in terms of execution times.
11:51
<NodeX>
I spent all weekend putting a 2 million document collection into SOLR 10,000 rows at a time only to find out it was about 1000 times faster to do it in one hit with a CSV :/
11:51
<remonvv>
I start question your sanity roughly at the "I spent all weekend" part of that sentence.
11:51
<remonvv>
questioning*
11:52
<NodeX>
it was every 45 mins reset the script
11:52
<NodeX>
increment the counter by 10,000 and start again lol
11:52
<remonvv>
I don't even know where to begin with that story.
11:52
<NodeX>
I needed integrity on the data else I would've scripted it
11:53
<remonvv>
Don't take this personally, but..
11:53
<NodeX>
laugh all you want lol
11:53
<remonvv>
I am kind sir.
11:53
<NodeX>
I earnt £2,000 for in total about 3 hours work
11:53
<* remonvv>
stops laughing
11:53
<remonvv>
You bastard.
11:54
<NodeX>
and I drank vodka the whole time and played xbox in between script runs lol
11:54
<remonvv>
Yeah, I rather hate you.
11:54
<NodeX>
had to stay up all weekend but it's all good
11:54
<remonvv>
I have to work a lot of hours for 2k
11:54
<NodeX>
I was very annoyed when I worked out I could do it in about 10 minutes with one large CSV late sunday night
11:55
<NodeX>
self-employment has its benefits
11:55
<NodeX>
but no security like a 9-5
11:55
<remonvv>
I can imagine. I bet it feels roughly similar to finding out some people earn 2k in potentially 10 minutes late monday afternoon.
11:56
<NodeX>
it's less than £50 per hour
11:56
<NodeX>
if you were to consult on a project you would charge that no ?
11:56
<remonvv>
A lot more, but you didn't actually spend the entire weekend did you?
11:56
<remonvv>
Or at least, shouldn't have ;)
11:56
<NodeX>
in total no but I had to be by the computer every 40-45 mins
11:57
<NodeX>
and stay up all night to make sure it all ended up done
12:02
<remonvv>
Fine fine, excuse accepted.
12:05
<remonvv>
Anyone ever looked at RavenDB?
12:06
<NodeX>
nope, I heard of it the other day
12:06
<remonvv>
It has the least informative site in existence.
12:06
<remonvv>
"It does A, B and C and does so awesomely!" "Cool, where can I read about it?" "What? you don't believe me?"
12:08
<remonvv>
multi doc transactions, sharding and scalability...hm. Sounds like one of those you can have 2 out of 3 sort of problems.
12:08
<NodeX>
if it does what it says on the tin it would be good
12:08
<NodeX>
transactions are a plus
12:08
<remonvv>
Are they?
12:08
<remonvv>
Overused, in my opinion.
12:08
<remonvv>
Or overrequired, is perhaps the appropriate statement.
12:09
<remonvv>
But still, how does one linearly scale up multi-doc transactions?
12:09
<remonvv>
You don't, is the answer to that rhetorical question.
12:09
<NodeX>
http api too which is nice
12:09
<remonvv>
And slow.
12:10
<remonvv>
REST API cpu overhead on our systems compared to (in our case) WebSockets is significant.
12:10
<NodeX>
I get awesome speed with SOLR
12:10
<Purdy>
i have an odd mongo question - i have a nested data structure where a value is either an objectid or an array ... yet when i try to use type, both resolve as objects
12:10
<Purdy>
more details here: http://pastebin.com/ByB3ReEZ
12:10
<remonvv>
Speed is relative, we average request times in the 0.02-0.1ms range
12:11
<NodeX>
my whole round trips including render are about 40ms
12:11
<remonvv>
Oh this is pure REST API, apples and oranges.
12:11
<NodeX>
(xmlhttprequest)
12:12
<remonvv>
Point being, those times are halved if we can bypass HTTP overhead.
12:12
<Purdy>
hmm ... looks like a known issue ... https://jira.mongodb.org/browse/SERVER-1475
12:12
<remonvv>
I'd sign for any site with an average request RTT of 40ms ;)
12:12
<NodeX>
that's why I get paid so much money at weekends lol
12:13
<NodeX>
if a page takes more than 1 sec to load I spend weeks re-writing it
12:13
<remonvv>
Purdy, it's one of those bugs that 10gen doesn't see as one ;)
12:13
<NodeX>
it clearly should be an array
12:14
<remonvv>
Kinda like some indexes changing the results of queries.
12:14
<remonvv>
And query results should be consistent with indexes on or off.
12:14
<Purdy>
heh, yeah, looks that way ... gotta figure a workaround, then
12:14
<remonvv>
Purdy, frankly if you need type checking there's probably something manky about your schema anyway ;)
12:14
<NodeX>
look at it in your App Purdy
12:14
<NodeX>
I make my app take care of all the casting/checking
12:15
<remonvv>
I mean it's a bit questionable that you don't know at query time what type a field is but need to query on it anyway.
12:15
<Purdy>
i'm doing a map reduce and just want to look at sets w/ >1 matches
12:16
<NodeX>
what people normally do in that situation is store the number of array members
12:16
<NodeX>
due to a lack of a count() / sizeof()
12:16
<Purdy>
*nod* ... working on a finalize method to do that
12:16
<remonvv>
If only you could do {$size:{$gt:1}} huh ;)
12:17
<remonvv>
For some reason $size only accepts absolute values.
12:17
<remonvv>
Which is not that useful
12:17
<remonvv>
NodeX's suggestion is the most common fix/workaround/feature
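The counter workaround can be sketched as two small builders: one produces an update document that appends to the array and bumps a stored count in the same operation, the other produces a range filter on that count. `$push`, `$inc` and `$gt` are standard operators; the field names `matches` and `match_count` and both function names are hypothetical.

```python
def push_with_count(array_field: str, value, count_field: str) -> dict:
    """Update document: append value to the array and increment its stored length."""
    return {"$push": {array_field: value}, "$inc": {count_field: 1}}


def more_than(count_field: str, n: int) -> dict:
    """Filter for documents whose stored array length exceeds n."""
    return {count_field: {"$gt": n}}


update = push_with_count("matches", "some-id", "match_count")
query = more_than("match_count", 1)
print(update)
print(query)
```

Because both changes travel in one update document, the array and its counter are modified together on a single document.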
12:17
<maxamillion>
are there any utilities (official or otherwise) that are popular in benchmarking performance of a mongodb setup/config?
12:17
<remonvv>
Not that I know of. We dev and run our own smoketests.
12:18
<remonvv>
Mostly because we need to simulate specific load patterns.
12:18
<maxamillion>
bummer
12:18
<NodeX>
make a simple script .. takes minutes
12:18
<remonvv>
Well, note, not that *I* know of.
12:18
<remonvv>
I don't believe in generic performance tests. You need application typical workload.
12:19
<maxamillion>
remonvv: right, which makes sense ... but I think its good to have a baseline
12:19
<remonvv>
True, what kind of information are you looking for?
12:19
<remonvv>
I can give you most of the generic numbers ;)
12:19
<remonvv>
Ballpark, anyway
12:20
<maxamillion>
remonvv: well, I'll give some background ... I work for Dell in a R&D lab ... I have almost unlimited hardware at my disposal and the idea is to setup a config, run some numbers, add a node, run the test again, add a node, run the test ... etc. then rinse and repeat by changing different variables in the cluster/config
12:20
pharkmillups joined
12:20
<maxamillion>
remonvv: change OS kernel tuning parameters, change networking options, run vs different filesystems, etc
12:21
<remonvv>
a) awesome job, b) okay
12:21
<maxamillion>
remonvv: we basically just want to profile how mongodb reacts to different things
12:21
<maxamillion>
remonvv: a) yes, I'm quite spoiled :)
12:21
<remonvv>
Well file systems make a big difference
12:21
<remonvv>
But that's well documented
12:21
<NodeX>
I would make a 1tb ram machine with 96 cores and play pong on it if I had that job
12:21
<maxamillion>
right, I've read a little on that topic
12:21
<savant>
NodeX: you would then be fired
12:21
<NodeX>
or download the entire internet
12:22
<NodeX>
just because "i could"
12:22
<maxamillion>
NodeX: I have a few of those in my rack right now ... but they don't play pong ... yet
12:23
<NodeX>
I can give you some numbers from various hardware I run.. I have a couple of 16gb servers, 1 x 128gb ram with 16 cores and some in between
12:23
<remonvv>
Mongo scales linearly for both reads and writes. Having twice the resources on a single node compared to 2 separate nodes is a noticeable difference. mongos (the query router) is relatively cpu heavy and as such there should be quite a few of them, we prefer app server local.
12:23
<NodeX>
as for sharding remonvv is your best bet to ask
12:23
<remonvv>
I don't think kernal tuning is going to get you that much.
12:23
<remonvv>
kernel, even
12:24
<remonvv>
Out of the documented optimizations.
12:24
<remonvv>
Outside of*
12:24
<maxamillion>
remonvv: which makes sense ... this basically boils down to my manager wanting me to investigate, run some tests and show some hard numbers with a profile of how things react based on what I change
12:24
<remonvv>
Fair enough, but there's a lot of possible variables and you can safely ignore the very low level ones.
12:24
<maxamillion>
remonvv: everything is very "conceptual" right now, but mongo has been making enough of a name for itself that I've been asked to look into it
12:25
<maxamillion>
right
12:25
<remonvv>
We run clusters that can hit 200-300k writes/sec sometimes and despite our best efforts fiddling with kernels or even linux configuration doesn't buy us much.
12:27
<remonvv>
The most important variables you should play with are active working set (how much of your data is regularly accessed) and read/write ratio (find sweetspot of number of shards versus number of rep set members per shard)
12:28
<maxamillion>
ah ok, well that's certainly good info to have
12:28
<maxamillion>
many thanks :)
12:29
<remonvv>
Yeah, active working set is a funny problem. You can completely kill performance if you don't have a properly right balanced index.
12:29
<maxamillion>
I will most likely look into https://github.com/brianfrankcooper/YCSB/wiki as a benchmark utility ... it seems to be mildly popular
12:29
<remonvv>
So your setup can look very different if you have to query recent data compared to random data.
12:30
<maxamillion>
right, hot vs. cold data
12:30
<remonvv>
MongoDB is built directly on top of memory mapped files so it'll always use the OS mem management.
12:30
<remonvv>
Which might actually make the OS itself a good variable to fiddle with.
12:31
<remonvv>
Although most work pretty similar.
12:31
corruptmemory joined
12:31
<remonvv>
So, back to relevant world...how did you land that job?
12:32
<maxamillion>
remonvv: luck of the draw honestly .... I've been a Fedora community member and active contributor for a while and I had a linux admin job .... I met some Fedora folks who worked for Dell and asked if they knew anyone who was hiring linux people .... one of them told me to send my resume ... and here I am
12:33
<remonvv>
Life is so unfair. First someone here earns 2k in 10 minutes, and the other lands an R&D job.
12:33
<maxamillion>
remonvv: I've been here for almost a year and a half and its been a blast
12:33
<remonvv>
Well, I wish you the best of luck with your research good man. I shall go and get a drink ;)
12:33
<maxamillion>
2k in 10 minutes?
12:33
<maxamillion>
remonvv: thanks! I really appreciate your time :)
12:34
<maxamillion>
this was very helpful info, I've taken notes :D
12:34
<remonvv>
Yeah. Well, the story is slightly more elaborate than that but that's what it boils down to in my head ;)
12:34
<remonvv>
No problem, keep us posted!
12:34
<remonvv>
I'm really off for that beer now.
12:59
<wad>
Is there a way to make a full copy of a mongo database, to the same cluster? I want to try some operations on a large database, but I need a way to "undo" it if I screw it up.
13:00
<jedir0x>
I have a replicaset with 3 members - the slaves haven't been updated for 3 days it seems - health is 1 and the pingMs is low - but their last update (optimeDate) is from 3 days ago... any ideas?
13:02
<MacYET>
wad: backup & restore
13:20
<giskard>
when i run a db.repairDatabase, how much free space i need to have..
13:20
<giskard>
the same size as the db+1?
13:21
<giskard>
Cannot repair database mytestdb having size: 45025853440 (bytes) because free disk space is: 2493149184 (bytes)\" }"
13:21
<MacYET>
in the worst case, yes
13:21
<giskard>
interesting
13:21
<giskard>
there is no way to force it?
13:21
<giskard>
i deleted all data btw..
13:21
<MacYET>
to force what?
13:22
<giskard>
reclaim space back
13:22
<giskard>
yes.. but repairDatabase() is returning that error
13:22
<MacYET>
then get more diskspace!
13:23
<wad>
MacYET, you suggested backup and restore... but I'm not finding those commands.
13:24
<MacYET>
google "mongodb backup"
13:25
<wad>
I'm still not finding it. It's not a sharded cluster....
13:25
<wad>
I can't do a mongodump (I don't have a filesystem available with enough disk space)
13:27
<wad>
I'm just going to do a copydb
13:30
<kali>
wad: they are command line tools: mongodump / mongorestore
13:30
<wad>
kali: Yeah, those are awesome. But I don't have the diskspace on a local filesystem right now to hold that backup. I'm good, just doing a db.copyDatabase right now. :)
13:32
mattbillenstein joined
13:36
<mFacenet>
kchodorow, ping
13:39
<kchodorow>
mFacenet: what's up
13:40
<mFacenet>
kchodorow, do you know of any changes in the mongo driver between 1.1.4 and 1.2.7 that would cause the driver not to release connections to the db itself ( we were using persistent connections before)
13:41
<mFacenet>
one of our datacenters just hosed itself, upgrading php from 5.3.2 and the driver from 1.1.4 to 1.2.7, no code change
13:46
corruptmemory joined
13:52
<sangcn>
Hello everybody
13:53
<sangcn>
I have a problem, i want to store video files which are up to 100mb, so should i choose gridfs?
13:54
<sangcn>
Hope somebody help me!
13:59
<kchodorow>
mFacenet: the connection code changes a lot, but it should work the same as persistent connections in 1.1.4
14:09
<artOfWar>
tried creating index on a collection using https://gist.github.com/2e3315115758db436735 and I get an error insert global_dashboard.system.indexes exception: ns name too long, max size is 128 code:10080 988340ms
14:15
<diminoten>
mongoskin's findById function simply doesn't work, does it?
14:15
scottfalconer joined
14:19
<diminoten>
I really don't get this
14:19
<diminoten>
not even slightly
14:19
<diminoten>
multiple hours of my life, just gone
14:22
<diminoten>
so how do I go from a hex string to a document in mongo
14:22
<diminoten>
seems like the easiest fucking thing in the world
14:22
<diminoten>
why isn't it...
14:23
<diminoten>
db.collection("foo").findOne({"_id":db.bson_serializer.ObjectID.createFromHexString(request.params.id)}, callback);
14:23
<apucacao>
i have two collections: User and Link. Users create Links. Users can also star links. When I retrieve all the Links as an authenticated User, I would like to also know which links were starred by the User in question. So each Link in the results would have an extra attribute, 'starred', set to true/false. How can I do this?
14:24
<diminoten>
why not db.collection("foo").findById(request.params.id, callback)?
14:24
<diminoten>
seems so fucking basic
14:24
<diminoten>
select * from foo where _id = '1234';
14:24
<diminoten>
it's basic in sql
14:28
<kali>
diminoten: why do you compare SQL with application side JS ? if you get down to the mongo JS shell, things will get close to what you expect
14:29
<diminoten>
I guess my real beef is the fact that it's not simply 1234 but BSON
14:29
<diminoten>
instead of a simple sequence it's complex
14:30
<diminoten>
and I'm glad it took a comparison to rdbms to get ANYONE to respond in here
14:30
<kali>
you can use string for ids if you prefer
14:30
<diminoten>
not by default I can't
14:31
<diminoten>
admittedly this works: db.foo.find({"_id":ObjectId("1234")});
14:32
<diminoten>
in console anyway
14:32
<diminoten>
or the "mongo js shell" to be specific
14:32
<diminoten>
what really twists my nipples is that I can't replicate this behavior in my js :-/
14:33
<kali>
you may want to try asking the mongoskin guys
14:33
<diminoten>
if I knew where I would
14:33
<kali>
i can't help you there
14:33
<skylamer`>
osx is not an os
14:35
<diminoten>
it's just frustrating because it seems like no one else has this issue, which leads me to believe it really is something stupid I'm doing
14:35
<diminoten>
and yet there literally isn't very much to screw up, it's so simple
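One thing that may be worth checking in the situation above: an ObjectId is 12 bytes, so its hex form is 24 characters, and a short value like "1234" cannot be converted by a driver. A minimal sketch of validating the request parameter before building the query follows. The helper names `isObjectIdHex` and `buildIdQuery` are hypothetical, and `toObjectId` stands for whatever conversion the driver offers (for example the `createFromHexString` call quoted earlier).

```javascript
// Hypothetical helper: true only for a 24-character hex string,
// which is the textual form of a 12-byte ObjectId.
function isObjectIdHex(value) {
  return typeof value === "string" && /^[0-9a-fA-F]{24}$/.test(value);
}

// Hypothetical helper: builds the query document for a lookup by _id.
// `toObjectId` is supplied by the caller and wraps the driver's own
// hex-to-ObjectId conversion; nothing here talks to the database.
function buildIdQuery(hex, toObjectId) {
  if (!isObjectIdHex(hex)) {
    throw new Error("not a 24-character hex string: " + hex);
  }
  return { _id: toObjectId(hex) };
}
```

The resulting document can then be passed to findOne as in the working call above; a string _id that was never converted will not match a document whose _id is an ObjectId.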
14:44
<artOfWar>
how to create a named index, I cannot use default naming as I'm hitting the name too long exception
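A rough sketch of how the name problem can be checked and worked around. The default index name is built from the key spec (field and direction joined by underscores), and the index namespace is `<db>.<collection>.$<indexName>`. The 128 limit below is taken from the error message quoted earlier and may differ between versions; the collection name `events` and the key spec are hypothetical.

```javascript
// Default name derived from a key spec, e.g. { a: 1, b: -1 } gives "a_1_b_-1".
function defaultIndexName(keys) {
  return Object.keys(keys).map(function (k) { return k + "_" + keys[k]; }).join("_");
}

// Full namespace of an index: <db>.<collection>.$<indexName>
function indexNamespace(db, coll, name) {
  return db + "." + coll + ".$" + name;
}

// Assumption: limit copied from the "max size is 128" error text.
// The strict comparison is deliberately conservative, and .length counts
// characters, so this is only accurate for ASCII names.
var MAX_NS_LENGTH = 128;

function fitsNamespaceLimit(db, coll, name) {
  return indexNamespace(db, coll, name).length < MAX_NS_LENGTH;
}

// In the mongo shell an explicit, shorter name can be passed as an option:
//   db.events.ensureIndex(keys, { name: "short_name" });
```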
14:57
<apucacao>
i have two collections: User and Link. Users create Links. Users can also star links. When I retrieve all the Links as an authenticated User, I would like to also know which links were starred by the User in question. So each Link in the results would have an extra attribute, 'starred', set to true/false. How can I do this?
14:58
<apucacao>
I would just appreciate a pointer :)
15:07
<bLiNdRaGe>
i'm having a weird issue
15:08
<bLiNdRaGe>
just upgraded ubuntu to 11.10, installed mongodb-10gen from http://downloads-distro.mongodb.org/repo/ubuntu-upstart/ dist/10gen mongodb-10gen amd64 2.0.2 via apt, did /etc/init.d/mongodb start, and i got this: http://pastebin.com/uP5CVVun
15:09
corruptmemory joined
15:10
<bLiNdRaGe>
fails on mongorestore too with couldn't connect
15:10
<bLiNdRaGe>
myoung@marcyoung:~/site$ start mongodb start: Rejected send message, 1 matched rules; type="method_call", sender=":1.8" (uid=1000 pid=1703 comm="start mongodb ") interface="com.ubuntu.Upstart0_6.Job" member="Start" error name="(unset)" requested_reply="0" destination="com.ubuntu.Upstart" (uid=0 pid=1 comm="/sbin/init ")
15:17
joaojeronimo joined
15:29
<bLiNdRaGe>
i don't understand why this isn't working: http://pastebin.com/1xacKXDb
15:30
<poseid>
having problems getting the mongo client to talk to the mongo server on an ubuntu system
15:31
<bLiNdRaGe>
looks like it's a lock file...not sure where the lock file resides
15:31
<poseid>
mongo tells BackgroundJob starting: ConnectBG
15:31
<poseid>
but afterwards times out
15:32
<poseid>
ok... i'll search for the lock file...
15:32
<poseid>
could be /data/db/mongod.lock
15:33
<bLiNdRaGe>
it's /var/lib/mongod/mongod.lock =)
15:33
<bLiNdRaGe>
i love logs
15:37
<poseid>
in the log: WARNING: You are running in OpenVZ. This is known to be broken!!!
15:37
<poseid>
[initandlisten] MongoDB starting : pid=3989 port=27017 dbpath=/data/db/ 64-bit
15:37
<poseid>
last line: [initandlisten] waiting for connections on port 27017
15:37
<poseid>
and web admin works
15:37
<freezey>
trying to set mongo.native_log
15:37
<freezey>
i am assuming this goes directly into php.ini?
15:38
<hardwire>
hola muchachos
15:39
<hardwire>
there.. I've given you all of my high school spanish.
15:44
<poseid>
hmm trying to reinstall with mongodb18-10gen
15:44
<poseid>
and this service mongod start
15:48
scottfalconer joined
15:55
<hardwire>
anybody have practical experience mixing key types (binary, numeric, string) in an index?
15:56
<hardwire>
I'm only interested in fast lookups and not specifically sorting via the index for this specific index.. I'm guessing everything gets converted to a binary (except for string which is pass through)
15:56
<wad>
hardwire, are you using shell access to the mongo, or the java driver, or something else?
15:56
<hardwire>
whad'ya know?
15:56
<wad>
I'm a mongo newb still, but I've been using the java driver. In that, you just, well, tell it which fields comprise the index. It doesn't seem to care about the types.
15:57
<hardwire>
wad: sure. I was more concerned with how the index was stored rather than how you insert documents.
15:57
<hardwire>
types are the simple part :)
15:58
<hardwire>
I guess I'm going to just throw a few million random fields in a db and find out myself :)
15:58
<wad>
I haven't dug under the covers of the mongo code.
15:58
<hardwire>
oh and then try sharding.
15:58
<wad>
There's nothing better than trying something out.
15:58
<hardwire>
shit.. sharding may make my life hellish
15:58
<wad>
Are you any good at javascript?
15:58
<hardwire>
wad: I'm horrid at javascript.
15:59
<wad>
I've got 140 million documents in a dozen different collections.
15:59
<hardwire>
kshep: hey now I like sharding :)
15:59
<hardwire>
wad: neat
15:59
<hardwire>
whatcha doin?
15:59
<wad>
And I need to convert some of the values from strings to longs and dates.
15:59
<wad>
I need to use db.eval so that it happens all server-side.
16:00
<wad>
The parameter to db.eval is a string, containing javascript code to do what you want to do.
16:00
<wad>
I gotta figure out the javascript.
16:00
<hardwire>
well you can cast types easily enough
16:00
<wad>
casting doesn't persist.... I need to drop all the indexes, change the types, then re-create the indexes.
16:01
<hardwire>
when you change a document it is reindexed isn't it?
16:01
<wad>
Right now all the longs, doubles, and dates are stored as strings.
16:01
<wad>
Even if you change the type of a value it is indexed on??
16:02
<hardwire>
ooh.. depending on your client code you could just make a 'typed' dictionary under the document that contains what you need.. then do a sparse index on that.
16:02
<hardwire>
that way you're rebuilding the index on the fly.
16:02
<hardwire>
and then you can remove the old data using unset
16:02
<hardwire>
and set the new data using set
16:02
<hardwire>
and most likely make the change in place
16:02
<hardwire>
without moving the document to a new disk area
16:02
<hardwire>
should be hella fast.
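A sketch of the idea described above, under the assumption that the converted values live in a sub-document called `typed` (a hypothetical name) and the old string fields are removed in the same update. The function only builds the update document; `buildTypedMigration` and the `converters` map are illustrative, not part of any driver.

```javascript
// Builds one update that writes converted values under "typed.<field>"
// and removes the original string field. Returns null when the document
// has nothing left to convert, so no empty update is sent.
function buildTypedMigration(doc, converters) {
  var set = {};
  var unset = {};
  var changed = false;
  Object.keys(converters).forEach(function (field) {
    if (typeof doc[field] === "string") {
      set["typed." + field] = converters[field](doc[field]);
      unset[field] = 1;
      changed = true;
    }
  });
  return changed ? { $set: set, $unset: unset } : null;
}

// Shell usage (collection name hypothetical):
//   db.things.ensureIndex({ "typed.count": 1 }, { sparse: true });
//   db.things.find().forEach(function (doc) {
//     var update = buildTypedMigration(doc, { count: function (s) { return parseInt(s, 10); } });
//     if (update) db.things.update({ _id: doc._id }, update);
//   });
```

Because the index is sparse, documents that have not been converted yet are simply absent from it, so the index fills in as the batch proceeds.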
16:03
<* hardwire>
dislikes evals.
16:03
<hardwire>
I like processing things in batch :)
16:03
<wad>
Still processing what you said.
16:04
<wad>
So, this database is not being used right now.
16:04
<wad>
I can safely delete all the indexes.
16:05
<wad>
I like the idea of using db.eval to do the conversions in-place, because it eliminates any network traffic to the computer running the client code.
16:05
<wad>
I can't imagine a faster way to make the conversions.
16:06
<wad>
I don't think I understood your idea, though. :-/
16:25
<hardwire>
https://gist.github.com/88691e9b9853afa4c559
16:25
<hardwire>
that's right.. it's mixing.
16:25
<hardwire>
20 million keys later.
16:27
napperjabber_ joined
16:28
corruptmemory joined
16:46
napperjabber joined
16:48
<apucacao>
i have two collections: User and Link. Users create Links. Users can also star links. When I retrieve all the Links as an authenticated User, I would like to also know which links were starred by the User in question. So each Link in the results would have an extra attribute, 'starred', set to true/false. How can I do this?
16:49
<yamagami>
Hi. Is there a pymongo issue with using a field named 'id' in a mongodb document? I'm getting an error "LookupError: unknown encoding: hex" when trying to find documents by a field named "id".
16:49
<apucacao>
is that something that would be done at application level?
16:49
<Adam725>
how would I make a query for items in the field containing double quotes?
16:50
<Adam725>
i.e. something like {field:/\"/}
16:58
<skot1>
yamagami: that is fine, can you post the mongo javascript shell session where you get the error?
16:59
<skot1>
Adam725: yes, does that not work?
16:59
<skot1>
apucacao: yes
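Doing this at application level could look roughly like the following. It assumes a hypothetical schema in which the User document carries an array of the _ids of the Links it has starred; the function name `markStarred` is also made up for the sketch.

```javascript
// Returns copies of the links with an extra boolean `starred` attribute.
// `starredLinkIds` is assumed to come from the authenticated user's document.
function markStarred(links, starredLinkIds) {
  // Compare ids by their string form so ObjectId-like values and plain strings both work.
  var starred = new Set(starredLinkIds.map(String));
  return links.map(function (link) {
    return Object.assign({}, link, { starred: starred.has(String(link._id)) });
  });
}
```

That is two queries (the user, then the links) followed by an in-memory merge, and the Link documents themselves are never modified.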
17:17
<yamagami>
skot1, It's from python code, not javascript.
17:17
<yamagami>
the error comes from pymongo
17:18
<yamagami>
from javascript i can 'find' on the collection using the 'id' field just fine
17:18
<yamagami>
also the exception is not destructive - I still get back the document
17:18
<yamagami>
though I'm not sure why i'm getting it
17:22
<yamagami>
skot1: http://dpaste.com/708648/
17:33
<surfdue>
Php Operation now in progress
17:33
<surfdue>
how can I prevent this?
17:38
<pr0ton>
how can i increase the in-memory entries for mongodb? i've got 7.5G of ram, but mongostat show only 2.09G of it being used
17:39
<Zelest>
how big is the entire dataset?
17:42
<pr0ton>
how do i find that? (you mean the size in mongo?)
17:43
<pr0ton>
vsize is 4.19G res is 2.1G
17:48
<wad>
I did: db.copyDatabase('old', 'new');
17:48
<wad>
After running for six hours, it's locked up.
17:48
<wad>
It should be about done. I've been checking the number of documents in the collections, and it should be done now.
17:48
<wad>
But the whole cluster is now blocking everything.
17:48
<wad>
You can't even do db.stats()
17:49
<wad>
Any ideas what is wrong?
17:49
<wad>
Or is this normal, and we should just wait it out?
18:09
<nicko>
hey all - is this a good place for newbie questions?
18:13
<nicko>
using a replica set, what happens when the node a client is connected to dies while it is iterating over a cursor? Can the client reconnect to another server and complete the iteration without starting over?
18:27
<Tobsn>
is it normal that the config server sends ~20 requests every 4 seconds back and forth to the shards?
18:28
<Tobsn>
sniffer says why: "doing balance round"
18:46
<diminoten>
I just instigated the creation of a million rows in a mongodb on my laptop... how bad of an idea was this?
18:47
<diminoten>
fairly small docs, about 12 values each
18:47
<diminoten>
the exponential growth of the db files is cool to watch though
18:48
<diminoten>
oh wtf, it cut off at 759461
18:48
<diminoten>
oh nevermind
18:49
<diminoten>
that's weird, I did a count() immediately upon completion and it gave an incorrect value
18:49
<diminoten>
future counts yield accurate reports however
19:17
<wad>
Okay, time for the Really Simple Question Of The Day.
19:17
<wad>
I'm writing some javascript code to modify a document.
19:17
<wad>
{ $inc : { field : value } }
19:17
<wad>
The field is a string that contains a number. I need to replace it with the number.
19:18
<wad>
{ $inc : { my_field_key_name : parseInt(VALUE) } }
19:18
<wad>
What do I put in for the VALUE?
19:18
<wad>
I need some sort of function like: getValueForKey(my_field_key_name)
19:18
<wad>
But surely there is a really obvious, simple way to do this....
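One possible answer, sketched under assumptions: $inc only works on numeric fields, so replacing a string with its numeric value is usually done by reading each document and issuing a $set with the parsed value. The helper name `buildNumericSet` and the collection name `things` are hypothetical.

```javascript
// Builds a $set that replaces a numeric string with the parsed number.
// Returns null when the field is not a string or does not parse,
// so such documents are left untouched.
function buildNumericSet(doc, field) {
  if (typeof doc[field] !== "string") {
    return null;
  }
  var parsed = parseInt(doc[field], 10);
  if (isNaN(parsed)) {
    return null;
  }
  var set = {};
  set[field] = parsed;
  return { $set: set };
}

// Shell usage: $type 2 selects fields currently stored as strings.
//   db.things.find({ my_field_key_name: { $type: 2 } }).forEach(function (doc) {
//     var update = buildNumericSet(doc, "my_field_key_name");
//     if (update) db.things.update({ _id: doc._id }, update);
//   });
```

Note that a plain JavaScript number is stored as a double from the shell; if a 64-bit integer is required, the parsed value would need to be wrapped in NumberLong instead.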
19:21
GabrielVieira joined
20:01
<Tobsn>
ah already gone
21:04
johnanderson joined
21:10
<skot1>
Tobsn: a balancer round is always being done even if there is no action to be taken.
21:11
<skot1>
nicko: the client must catch the exception and retry.
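A sketch of what catch-and-retry can look like without starting over. The resume strategy is an assumption added here, not something stated above: the iteration is sorted by _id and each retry asks only for documents after the last one handled. `fetchAfter` is a hypothetical, synchronous stand-in for the driver query (roughly a find on `_id > lastId` sorted by _id with a limit) that throws when the connection is lost.

```javascript
// Iterates over all documents in _id order, resuming after the last
// handled _id when a fetch fails. Gives up after `maxRetries` failures.
function iterateWithRetry(fetchAfter, handle, maxRetries) {
  var lastId = null;
  var failures = 0;
  for (;;) {
    var batch;
    try {
      batch = fetchAfter(lastId); // may throw if the node went away
    } catch (err) {
      failures += 1;
      if (failures > maxRetries) {
        throw err;
      }
      continue; // retry from the same position
    }
    if (batch.length === 0) {
      return; // nothing left after lastId
    }
    batch.forEach(function (doc) {
      handle(doc);
      lastId = doc._id;
    });
  }
}
```

A real client would normally wait between retries so the replica set has time to elect a new primary; that delay is left out to keep the sketch short.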
21:12
<skot1>
diminoten: depending on how you are doing the inserts from your client it might not have been done reading the writes out of the network buffers.
21:12
<skot1>
if you don't do safe writes then it will just push lots onto the network until it backs up.
21:19
<laner>
i have a general administration question.
21:20
<laner>
I am setting up a new machine (ubuntu) and am trying to figure out how to wrap the numactl call in /etc/init.d/mongodb
21:20
<laner>
any pointers?
21:22
mattbillenstein joined
21:43
corruptmemory joined
21:54
<skot1>
just change the startup command in the upstart job
21:55
<skot1>
if you search the mongodb-user google group there are people who have posted examples.