Linux @ 100% of 1 core

Posted: Sat Apr 26, 2014 5:30 pm
by John Adams
Not sure if this is related to my spawns, or Xen's "going back to a chunk you just left" but I'm watching a player move around my world. Initially it was fine, but he changed chunks, then almost immediately did again, and now World seems hosed.
pegged.jpg
It does not seem to come back from this gracefully, at least not yet.

This VM has 4 cores, so it's really only 25% slammed, but "top" is dumb. Or maybe I am, because I don't know how to tell it to consider multi-core.
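
On the "top" question: the procps version of top reports a process's %CPU relative to a single core by default, so one pegged core reads as 100%. Pressing "1" inside top toggles the per-core summary lines, and "I" (Shift+i) toggles between per-core and whole-machine scaling. The arithmetic behind the "really only 25%" remark can be sketched as below; `wholeMachinePercent` is a hypothetical helper name, not something from the World source.

```cpp
#include <thread>

// Convert a per-core CPU percentage (one fully busy core reads as 100%)
// into a share of the whole machine.
// Hypothetical helper for illustration only.
double wholeMachinePercent(double perCorePercent, unsigned coreCount) {
    // std::thread::hardware_concurrency() may return 0 when the count is
    // unknown, so treat 0 as a single core to avoid dividing by zero.
    if (coreCount == 0) {
        coreCount = 1;
    }
    return perCorePercent / coreCount;
}

// Same conversion, using the core count reported by the standard library.
double wholeMachinePercentOnThisHost(double perCorePercent) {
    return wholeMachinePercent(perCorePercent, std::thread::hardware_concurrency());
}
```

With 4 cores, a process shown at 100% works out to 25% of the machine, which matches the numbers in the post above.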

Re: Linux @ 100% of 1 core

Posted: Sat Apr 26, 2014 5:41 pm
by John Adams
I have to say I haven't seen this before: running World in gdb, it crashed and disappeared completely.

[Thread 0x7fffcd7fa700 (LWP 4810) exited]
[Thread 0x7fffcdffb700 (LWP 4809) exited]
[Thread 0x7fffce7fc700 (LWP 4808) exited]
[Thread 0x7fffceffd700 (LWP 4807) exited]
[Thread 0x7fffcf7fe700 (LWP 4806) exited]
[Thread 0x7fffcffff700 (LWP 4805) exited]
[Thread 0x7fffecff9700 (LWP 4385) exited]
[Thread 0x7fffed7fa700 (LWP 4384) exited]
[Thread 0x7fffedffb700 (LWP 4383) exited]
[Thread 0x7fffee7fc700 (LWP 4382) exited]
[Thread 0x7fffeeffd700 (LWP 4173) exited]
[Thread 0x7fffef7fe700 (LWP 4172) exited]
[Thread 0x7fffeffff700 (LWP 3934) exited]
[Thread 0x7ffff4850700 (LWP 3933) exited]
[Thread 0x7ffff5051700 (LWP 3932) exited]
[Thread 0x7ffff5852700 (LWP 3931) exited]
[Thread 0x7ffff6681700 (LWP 3930) exited]

Program terminated with signal SIGKILL, Killed.
The program no longer exists.
(gdb) bt
No stack.
(gdb)
The player in question above generated some interesting logging too (attached).

Re: Linux @ 100% of 1 core

Posted: Mon Apr 28, 2014 5:29 am
by theFoof
Yeah, I've seen that second crash before. It seems like the client flips out sometimes when you fail to zone, and that's the result.

Re: Linux @ 100% of 1 core

Posted: Mon Apr 28, 2014 6:31 am
by John Adams
And again.
04:29:10.114 I Chunk    New zone (-28, 24)
04:29:10.135 I Chunk    Received zone request!
04:29:10.135 I Chunk    New zone (-28, 24)
04:29:10.156 I Chunk    Received zone request!
04:29:10.156 I Chunk    New zone (-28, 24)
04:29:10.176 I Chunk    Received zone request!
04:29:10.176 I Chunk    New zone (-28, 24)
04:29:10.197 I Chunk    Received zone request!
04:29:10.197 I Chunk    New zone (-28, 24)
04:29:10.218 I Chunk    Received zone request!
04:29:10.218 I Chunk    New zone (-28, 24)
04:29:10.295 I Chunk    Received zone request!
04:29:10.295 I Chunk    New zone (-28, 24)
04:29:10.311 E UDP      Packet to client timeout after 300 milliseconds (3 tries, 100 milliseconds each)
04:29:10.311 E UDP      Either there is a lot of packet loss or the client is not acking this packet
0000: 00 0D 00 3A 72 23 2B 0B - 23 03 11 7A 73 2B 6B 0B  ...:r#+.#..zs+k.
0010: 3B 2B 03 A0 02 7C 00 00 - 00 F4 2E 00 00 A8 00 00  ;+...|..........
0020: 00 50 03 C8 00 00 00 50 - 9C 00 00 0C 00 00 00 D8  .P.....P........
0030: 10 00 00 10 F0 10 00 00 - 00 77 00 00 00 C0 BE 0B  .........w......
0040: 00 00 20 87 C1 19 00 00 - 10 9B C6 00 90 00 00 00  .. .............
0050: 00 00 00 00 20 00 00 00 - 00 00 00 00 60 D5 6B 00  .... .......`.k.
0060: 00 00 00 00 32 42 BF 59 - 0C D5 4D 0D DD F0 DF 01  ....2B.Y..M.....
0070: 00 20 00 00 C1 00 00 00 - 00 00 00 00 B0 F1 6B 00  . ............k.
0080: 00 00 00 00 48 08 E6 F6 - FF 7F 00 00 68 75 6E 6B  ....H.......hunk
0090: 20 20 20 20 52 65 63 65 - 69 76 65 64 20 61 63 74      Received act
00A0: 6F 72 20 62 69 74 73 74 - 72 65 61 6D 20 66 72 6F  or bitstream fro
00B0: 6D 20 63 6C 69 65 6E 74 - 20 62 75 74 20 64 6F 6E  m client but don
00C0: 27 74 20 6B 6E 6F 77 20 - 68 6F 77 20 74 6F 20 68  't know how to h
00D0: 61 6E 64 6C 65 20 69 74 - 2E 2E 2E 79 65 74 2E 00  andle it...yet..
00E0: 02 84 00 00 00 E8 5D 00 - 00 70 00 00 00 50 03 C8  ......]..p...P..
00F0: 00 00 00 82 55 04 00 0C - 00 00 00 D8 10 02 00 00  ....U...........
0100: 70 19 00 00 00 B7 00 00 - 00 40 BA EA              p........@..
04:29:10.316 I Chunk    Received zone request!
04:29:10.316 I Chunk    New zone (-28, 24)
[Thread 0x7fffedffb700 (LWP 8325) exited]
[Thread 0x7fffee7fc700 (LWP 8324) exited]
[Thread 0x7fffeeffd700 (LWP 8323) exited]
[Thread 0x7fffef7fe700 (LWP 8322) exited]
[Thread 0x7fffeffff700 (LWP 8136) exited]
[Thread 0x7ffff4850700 (LWP 8135) exited]
[Thread 0x7ffff5051700 (LWP 8134) exited]
[Thread 0x7ffff5852700 (LWP 8133) exited]
[Thread 0x7ffff6681700 (LWP 8132) exited]
(gdb) bt
No stack.
(gdb)
wtf is up with "no stack"? How is this possible?

Re: Linux @ 100% of 1 core

Posted: Mon Apr 28, 2014 6:37 am
by theFoof
When I've seen this, it was an out-of-memory exception, so maybe that's just what Linux does in that situation. Dunno.
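
The out-of-memory guess fits the first gdb session, which ended with "Program terminated with signal SIGKILL". SIGKILL cannot be caught, blocked, or ignored, so the kernel removes the process outright and gdb has nothing left to unwind, hence "No stack." The Linux OOM killer ends processes with SIGKILL, and the kernel log (for example via dmesg) usually records when it does. Whether the OOM killer was responsible here is an assumption; the thread does not show the kernel log. A minimal sketch of what SIGKILL termination looks like from a parent process, using only POSIX calls and a hypothetical function name:

```cpp
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Fork a child that sleeps forever, kill it with SIGKILL, and return the
// number of the signal that terminated it (or -1 on any failure).
// Hypothetical demo function, not part of the World server.
int killChildAndReportSignal() {
    pid_t child = fork();
    if (child < 0) {
        return -1;  // fork failed
    }
    if (child == 0) {
        // Child: wait for signals forever. SIGKILL gives it no chance to
        // run a handler or clean up.
        for (;;) {
            pause();
        }
    }
    // Parent: send SIGKILL, then collect the child's exit status.
    if (kill(child, SIGKILL) != 0) {
        return -1;
    }
    int status = 0;
    if (waitpid(child, &status, 0) != child) {
        return -1;
    }
    // WIFSIGNALED is true when the child died from a signal.
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

The parent only learns which signal ended the child. There is no remaining process state to inspect, which is the same position gdb is in after "The program no longer exists."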

Re: Linux @ 100% of 1 core

Posted: Mon Apr 28, 2014 4:40 pm
by John Adams
Services back on Windows. Sorry Linux.

I just ran through a dozen zones, crossing back into previous ones from Southwatch to Falgarhom, and not only is the NPC loading blazing fast on Windows, there are no memory issues or crashing (even running in Debug x86). World memory footprint is about 4.2MB with all those zones loaded, and Login is taking a whole 1.8MB of RAM.

Scatnap... here's a 10s video of the world console almost an hour after I was in the game, zoning my butt off, until a 2nd client connected (way back in Southwatch, nowhere near where I am).

http://vgoemulator.net/images/spam.mp4

It continued like that while I ran nearly across an entire chunk, and now it would seem the 2nd player is doing the same - running like hell to somewhere. Whereas I just got the "don't know how to handle it... yet" loggers, he's getting 1 of those and a 0x1 packet dump, then another one, and packet dump.

NOTE: Memory is also 10.6MB now after the 2nd client joined.

Edit: now Windows is doing the same thing Linux did. So far 1 core is pegged (25% of 4 cores), stuck in a race condition or something.
25-percent.jpg
Edit2: Now the explosion happens, this thing is now in a death spiral.

http://vgoemulator.net/images/explosion.mp4

Re: Linux @ 100% of 1 core

Posted: Tue Apr 29, 2014 3:45 pm
by John Adams
Windows finally burped out an error before crashing, although I do not think this is the cause of the issue. It is another effect of it.

[quote]0000: 00 09 00 D4 00 00 B1 9F - 25 00 01 00 00 00 6C E0 ........%.....l.
11:14:18.962 E UDP Packet to client timeout after 300 milliseconds (3 tries, 100 milliseconds each)
11:14:18.962 E UDP Either there is a lot of packet loss or the client is not acking this packet
0000: 00 09 00 D5 00 00 8B A7 - 25 00 01 00 00 00 17 A0 ........%.......
Out of memory trying to allocate log buffer (level=0, file='c:\vgoemu\src\common\socketdata.cpp', function='SocketData::SetData', line=63)
Out of memory trying to allocate log buffer (level=0, file='c:\vgoemu\src\common\udpserver.cpp', function='UDPServer::SendAck', line=774)
Out of memory trying to allocate log buffer (level=0, file='c:\vgoemu\src\common\udpserver.cpp', function='UDPServer::SendAck', line=774)
Out of memory trying to allocate log buffer (level=0, file='c:\vgoemu\src\common\datastruct.cpp', function='DataStruct::Copy', line=1137)
Out of memory trying to allocate log buffer (level=0, file='c:\vgoemu\src\common\udpserver.cpp', function='UDPServer::SendAck', line=774)
Out of memory trying to allocate log buffer (level=0, file='c:\vgoemu\src\common\udpserver.cpp', function='UDPServer::SendAck', line=774)
Out of memory trying to allocate 155 bytes for log messageOut of memory trying to allocate 142 bytes for log messageOut of memory trying to allocate log buffer
(level=0, file='c:\vgoemu\src\common\udpserver.cpp', function='UDPServer::SendAck', line=774)
Out of memory trying to allocate log buffer (level=0, file='c:\vgoemu\src\common\udpserver.cpp', function='UDPServer::SendAck', line=774)
[/quote]
This was after the runaway train already ran away for who knows how long.

Re: Linux @ 100% of 1 core

Posted: Tue Apr 29, 2014 5:08 pm
by theFoof
Any idea what commit this started with? (The 100% usage part)

Re: Linux @ 100% of 1 core

Posted: Wed Apr 30, 2014 2:13 pm
by John Adams
It's done that since World was born. Scatman fixed one of them (I think there was another post, or SVN log at least). Once Xen allowed us to start changing chunks is when it came back with a vengeance. Add to that, my blasting 500 NPCs into every chunk someone crosses into. The Perfect Storm.

Re: Linux @ 100% of 1 core

Posted: Wed Apr 30, 2014 7:39 pm
by John Adams
I was in the world for over an hour, using the new .rift command, loading NPCs and running from Halgarad through to Veskal's Exchange, where I stopped to post something unrelated. While I was standing still in game, another player logged in (likely Southwatch) and I believe they started moving through the world on foot. Once again, this New Zone ( ) spam started.
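
The earlier log shows the same request repeating roughly every 20 ms for one coordinate pair, (-28, 24). One possible mitigation, sketched here under the assumption that the repeats are redundant and safe to ignore, is a per-client guard that drops a zone request for the chunk the client is already in. `ChunkCoord` and `ZoneRequestGuard` are hypothetical names, not taken from the World source, and this does not explain why the repeats are generated in the first place.

```cpp
// Hypothetical chunk coordinate, e.g. (-28, 24) as printed in the log.
struct ChunkCoord {
    int x;
    int y;
    bool operator==(const ChunkCoord& other) const {
        return x == other.x && y == other.y;
    }
};

// Hypothetical per-client guard that remembers the last chunk accepted.
class ZoneRequestGuard {
public:
    // Returns true if the request should be processed, false if it repeats
    // the chunk that was most recently accepted for this client.
    bool shouldProcess(const ChunkCoord& requested) {
        if (hasCurrent_ && current_ == requested) {
            return false;  // duplicate request, skip the expensive reload
        }
        current_ = requested;
        hasCurrent_ = true;
        return true;
    }

private:
    ChunkCoord current_{0, 0};
    bool hasCurrent_ = false;
};
```

Only the most recent chunk is remembered, so going back to a chunk the player just left is still processed normally. Only back-to-back repeats of the same coordinates are dropped.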

I clicked "Pause" in VS2012, and stepped over a few C++ functions, noticing they were all in "deque" and something called "xmemory0". No idea what those are, but it's the first time I caught the server going crazy before it ate itself alive.
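
For context, "deque" is the standard library header for std::deque, and "xmemory0" is an internal header of the Visual C++ standard library that holds its allocator code. Landing in those two while memory climbs would be consistent with a queue that is filled faster than it is drained, though that is a guess from the symptoms, not something the thread confirms. A minimal sketch of capping such a queue, with a hypothetical `BoundedQueue` class that is not part of the World source:

```cpp
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical FIFO with a hard capacity. When full, new items are rejected
// and counted instead of growing the deque without limit.
// Not thread-safe: callers would need their own locking.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    // Returns false (and drops the item) when the queue is already full.
    bool push(T item) {
        if (items_.size() >= capacity_) {
            ++dropped_;
            return false;
        }
        items_.push_back(std::move(item));
        return true;
    }

    // Moves the oldest item into `out`. Returns false when the queue is empty.
    bool pop(T& out) {
        if (items_.empty()) {
            return false;
        }
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

    std::size_t size() const { return items_.size(); }
    std::size_t dropped() const { return dropped_; }

private:
    std::deque<T> items_;
    std::size_t capacity_;
    std::size_t dropped_ = 0;
};
```

A cap like this trades silent memory growth for a visible drop counter, which at least makes a runaway producer show up as a number rather than as an out-of-memory crash.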