
Re: Linux @ 100% of 1 core

Posted: Thu May 01, 2014 4:20 am
by theFoof
I'll take a look. What I could see happening somewhere is a deque being drained with while(!empty()); on a very heavily used list that could cause problems. BUT I think all of the lists I switched over were wrapped in a mutex, so I'm not sure that would really be an issue. They should have a limited size.

That might have just been because that's where all packets are sent/received into, so it would be very heavily used. Plus I think we use 2 threads per UDP server, so you would see a lot of those functions at once, and that would be totally normal.

EDIT: Yeah not seeing anything that jumps out at me with any deques.
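
For anyone following along, here is a minimal sketch of the pattern I mean: a mutex-guarded deque where the consumer swaps the contents out under the lock and then iterates a fixed snapshot, instead of looping on while(!empty()) against a queue that producers keep filling. Packet, PacketQueue and ProcessOnce are made-up names for illustration, not the actual server classes.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical stand-in for whatever the server queues; not a real project type.
struct Packet {
    int id;
};

class PacketQueue {
public:
    void Push(const Packet& p) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push_back(p);
    }

    // Moves everything queued so far into a local deque and returns it.
    // The lock is held only for the swap, so producers are blocked briefly
    // and the consumer works on a snapshot whose size can no longer grow.
    std::deque<Packet> TakeAll() {
        std::deque<Packet> batch;
        std::lock_guard<std::mutex> lock(mutex_);
        batch.swap(queue_);
        return batch;
    }

    std::size_t Size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return queue_.size();
    }

private:
    mutable std::mutex mutex_;
    std::deque<Packet> queue_;
};

// Processes one snapshot and returns how many packets were handled.
std::size_t ProcessOnce(PacketQueue& q) {
    std::size_t handled = 0;
    for (const Packet& p : q.TakeAll()) {
        (void)p;  // real handling would go here
        ++handled;
    }
    return handled;
}
```

Anything pushed while a snapshot is being processed simply waits for the next ProcessOnce call, so a single pass always terminates.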

Re: Linux @ 100% of 1 core

Posted: Tue May 06, 2014 7:53 am
by John Adams
Another version of the same. Where it crashed is likely just a victim of the out-of-memory condition.

FYI: New Telon is configured right now to only load 5 random spawns, so this is not related to NPC loading.

Stack

Code:

 	KernelBase.dll!76f0c41f()	Unknown
 	[Frames below may be incorrect and/or missing, no symbols loaded for KernelBase.dll]	
 	WorldServer.exe!_CxxThrowException(void * pExceptionObject, const _s__ThrowInfo * pThrowInfo) Line 152	C++
 	WorldServer.exe!operator new(unsigned int size) Line 63	C++
>	WorldServer.exe!Net::SendClientChunkInfo(Client & client, int coord_x, int coord_y) Line 478	C++
 	WorldServer.exe!ChunkServer::HandleZoneRequest(Client & client, PacketStruct & packet_struct) Line 756	C++
 	WorldServer.exe!ChunkServer::Process() Line 188	C++
 	WorldServer.exe!ChunkServerList::Process() Line 172	C++
 	WorldServer.exe!main(int argc, char * * argv) Line 119	C++
 	WorldServer.exe!__tmainCRTStartup() Line 241	C
 	WorldServer.exe!mainCRTStartup() Line 164	C
 	kernel32.dll!7669338a()	Unknown
 	ntdll.dll!77709f72()	Unknown
 	WorldServer.exe!_wstat64(const wchar_t * name, _stat64 * buf) Line 243	C
Console
[quote]22:28:17.521 I Chunk Received zone request!
22:28:17.521 I Chunk New zone (-28, 24)
22:28:17.521 I Net Couldn't find client by account id in Net::SendClientChunkInfo
22:28:17.583 I Chunk Received zone request!
22:28:17.583 I Chunk New zone (-28, 24)
22:28:17.583 I Net Couldn't find client by account id in Net::SendClientChunkInfo
Out of memory trying to allocate log buffer (level=0, file='c:\vgoemu\src\common\soeprotocoldata.cpp', function='SOEProtocolData::SetData', line=144)
Out of memory trying to allocate 122 bytes for log messageOu22:28:17.661 t ofI
mChunk emorNew zone (-28, 24)
y trying to allocate 160 bytes for log messageOut 22:28:17.676 of mI emUDP
ory Client from 162.201.117.157 has been removed
trying to allocate 162 bytes for log message22:28:17.692 I UDP New client c
onnected from 162.201.117.157
22:28:17.739 I Chunk Received zone request!
22:28:17.739 I Chunk New zone (-28, 24)[/quote]

Re: Linux @ 100% of 1 core

Posted: Wed May 07, 2014 9:44 pm
by theFoof
Ok, so I'm pretty sure this originates from the server either receiving a "bad" packet or not reading some type of packet correctly. I hooked up an app that tracks per-thread CPU usage, and when this happened a UDPServer reader thread was using up the CPU. The multi packet handler makes a call to the ReadHelper, which makes yet another call to the multi packet handler, resulting in an infinite loop if the packet is read in a certain way.

I'm going to bandaid this with a timer. Not sure how to do a proper fix yet. Hard to say what that would be because I can't see the original packet that causes this. You don't know it's happening until the loop has already started.
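
To illustrate the kind of timer bandaid I mean, here is a rough sketch. The packet layout is invented for the example (opcode 0x03 marks a multi packet, and each sub-packet starts with a size byte that counts itself), and kReadBudget, ReadContext and ReadPacket are hypothetical names; only the handler/helper call pattern matches what is described above. A size byte of 0 never advances the read position, so without the deadline check the loop would spin forever.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

using Clock = std::chrono::steady_clock;

// Assumed values for illustration; the real timeout and opcode may differ.
constexpr std::chrono::milliseconds kReadBudget{50};
constexpr std::uint8_t kOpMulti = 0x03;

struct ReadContext {
    Clock::time_point deadline;  // give up once this point in time has passed
    std::size_t leaf_packets;    // non-multi packets seen so far
};

bool ReadHelper(const std::uint8_t* data, std::size_t len, ReadContext& ctx);

// Walks the sub-packets of a multi packet body. Each sub-packet is
// [size][body...], where size includes the size byte itself.
bool HandleMultiPacket(const std::uint8_t* data, std::size_t len, ReadContext& ctx) {
    std::size_t pos = 0;
    while (pos < len) {
        if (Clock::now() >= ctx.deadline) {
            return false;  // the bandaid: bail out instead of spinning
        }
        const std::size_t size = data[pos];
        if (pos + size > len) {
            return false;  // truncated sub-packet
        }
        if (size > 1 && !ReadHelper(data + pos + 1, size - 1, ctx)) {
            return false;
        }
        pos += size;  // size == 0 makes no progress; only the deadline stops it
    }
    return true;
}

// Dispatches on the first byte: multi packets recurse, anything else is a leaf.
bool ReadHelper(const std::uint8_t* data, std::size_t len, ReadContext& ctx) {
    if (len == 0) {
        return true;
    }
    if (data[0] == kOpMulti) {
        return HandleMultiPacket(data + 1, len - 1, ctx);
    }
    ++ctx.leaf_packets;
    return true;
}

// Entry point: returns false if the packet was abandoned.
bool ReadPacket(const std::vector<std::uint8_t>& bytes, std::size_t& leaf_count) {
    ReadContext ctx{Clock::now() + kReadBudget, 0};
    const bool ok = ReadHelper(bytes.data(), bytes.size(), ctx);
    leaf_count = ctx.leaf_packets;
    return ok;
}
```

In this toy format a proper fix would be to reject any sub-packet that does not advance the read position, but that depends on knowing which packet triggers the loop, which is exactly what I can't see yet.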

Re: Linux @ 100% of 1 core

Posted: Wed May 07, 2014 10:57 pm
by theFoof
I did some tests and it seems like the timer fixed this, so I committed it. I also moved the outgoing queues from UDPServer to Client, which should reduce packet lag with multiple clients a lot. Make sure you grab the latest opcodes/structs.
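
A simplified picture of the queue change, as a sketch rather than the committed code: each client owns its own outgoing deque, and a send pass gives every client a small budget, so one client with a large backlog can't hold up everyone behind it in a single shared queue. The payload type, the budget parameter and SendPass are assumptions made up for the example.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

class Client {
public:
    void PushOutgoing(std::string payload) {
        std::lock_guard<std::mutex> lock(mutex_);
        outgoing_.push_back(std::move(payload));
    }

    // Removes and returns up to max_count queued payloads, oldest first.
    std::vector<std::string> PopOutgoing(std::size_t max_count) {
        std::vector<std::string> batch;
        std::lock_guard<std::mutex> lock(mutex_);
        while (!outgoing_.empty() && batch.size() < max_count) {
            batch.push_back(std::move(outgoing_.front()));
            outgoing_.pop_front();
        }
        return batch;
    }

    std::size_t Pending() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return outgoing_.size();
    }

private:
    mutable std::mutex mutex_;
    std::deque<std::string> outgoing_;
};

// One send pass over all clients. Returns the payloads in the order they
// would be written to the socket; a real server would call sendto() here.
std::vector<std::string> SendPass(const std::vector<Client*>& clients,
                                  std::size_t per_client_budget) {
    std::vector<std::string> sent;
    for (Client* client : clients) {
        for (std::string& payload : client->PopOutgoing(per_client_budget)) {
            sent.push_back(std::move(payload));
        }
    }
    return sent;
}
```

With a budget of 2, a client holding three queued packets sends two of them, the next client's single packet goes out in the same pass, and the leftover packet waits for the following pass.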

Re: Linux @ 100% of 1 core

Posted: Thu May 08, 2014 5:04 pm
by John Adams
Thanks for your continued support, foof. This code is on New Telon now ( still no NPCs yet, soon(tm) )

Re: Linux @ 100% of 1 core

Posted: Mon May 12, 2014 3:01 pm
by John Adams
Not sure if it was me, or if a Dev checked out New Telon via RDP, but apparently someone clicked inside the "DOS Box", which halted the process. I have disabled QuickEdit Mode. I think it was stuck like that for days. As soon as I hit Enter in the window, a ton of connect/disconnect attempts flew by.

Re: Linux @ 100% of 1 core

Posted: Wed May 14, 2014 6:48 am
by John Adams
Sad to report, the zoning spam happened again this morning. First time in many connections.

Console (example):

Code:

04:57:28.793 I Chunk    Received zone request!
04:57:28.793 I Chunk    New zone (-28, 24)
04:57:28.855 I Chunk    Received zone request!
04:57:28.855 I Chunk    New zone (-28, 24)
04:57:28.917 I Chunk    Received zone request!
04:57:28.917 I Chunk    New zone (-28, 24)
04:57:28.979 I Chunk    Received zone request!
04:57:28.979 I Chunk    New zone (-28, 24)
0000: 00 0D 00 6D 00 00 00 00 - 00 00 00 00 00 00 00 00  ...m............
0010: 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00  ................
0020: 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00  ................
0030: 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00  ................
0040: 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00  ................
0050: 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00  ................
0060: 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00  ................
0070: 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00  ................
0080: 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00  ................
0090: AC 53                                              .S
04:57:29.041 I Chunk    Received zone request!
04:57:29.150 I Chunk    New zone (-28, 24)
04:57:29.212 I Chunk    Received zone request!
04:57:29.212 I Chunk    New zone (-28, 24)
04:57:29.305 I Chunk    Received zone request!
04:57:29.305 I Chunk    New zone (-28, 24)
Stack:

Code:

 	WorldServer.exe!std::_Container_base12::_Orphan_all() Line 216	C++
 	WorldServer.exe!std::deque<ClientSocketData *,std::allocator<ClientSocketData *> >::push_back(ClientSocketData * const & _Val) Line 1487	C++
>	WorldServer.exe!Client::PushOutgoing(SOEProtocolData * data, bool dump_packet) Line 303	C++
 	WorldServer.exe!Client::QueuePacket(SOEProtocolData * data, bool dump_packet) Line 228	C++
 	WorldServer.exe!ChunkServer::HandleZoneRequest(Client & client, PacketStruct & packet_struct) Line 739	C++
 	WorldServer.exe!ChunkServer::Process() Line 189	C++
 	WorldServer.exe!ChunkServerList::Process() Line 172	C++
 	WorldServer.exe!main(int argc, char * * argv) Line 124	C++
 	WorldServer.exe!__tmainCRTStartup() Line 241	C
 	WorldServer.exe!mainCRTStartup() Line 164	C
 	kernel32.dll!7669338a()	Unknown
 	[Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]	
 	ntdll.dll!77709f72()	Unknown
 	ntdll.dll!77709f45()	Unknown
In this instance, it crashed long before running out of memory though, so a slightly different outcome.

Re: Linux @ 100% of 1 core

Posted: Wed May 14, 2014 2:04 pm
by theFoof
[quote="John Adams"]Sad to report, the zoning spam happened again this morning. First time in many connections.[/quote]

Looks like, from the stack, the client was probably deleted before all of its packets were dealt with, because the client spams that freaking much. I'm going to add a cap of 1000 incoming packets from a client at one time (that number may change; I need to make sure it doesn't alter normal gameplay). Once that is exceeded, the server will disconnect the client. I'm also going to stop processing packets from a client that has been disconnected.

(This is also a separate crash from the other that was bandaided.)
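
Roughly what I have in mind for the cap, as a single-threaded sketch. The 1000 comes from above and may change; the class layout and the method names other than the general idea are made up for the example, and the real code would need the same locking as the existing queues.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Cap on queued incoming packets per client; the value is tentative.
constexpr std::size_t kMaxPendingIncoming = 1000;

class Client {
public:
    // Queues a packet for processing. Returns false if it was dropped,
    // either because the client is already disconnected or because this
    // packet pushed the client past the cap.
    bool QueueIncoming(std::string packet) {
        if (disconnected_) {
            return false;
        }
        if (incoming_.size() >= kMaxPendingIncoming) {
            Disconnect();
            return false;
        }
        incoming_.push_back(std::move(packet));
        return true;
    }

    // Handles everything queued and returns how many packets were processed.
    // A disconnected client never has its packets processed.
    std::size_t ProcessIncoming() {
        if (disconnected_) {
            return 0;
        }
        std::size_t handled = 0;
        while (!incoming_.empty()) {
            incoming_.pop_front();  // real packet handling would go here
            ++handled;
        }
        return handled;
    }

    void Disconnect() {
        disconnected_ = true;
        incoming_.clear();  // drop whatever was still waiting
    }

    bool IsDisconnected() const { return disconnected_; }
    std::size_t Pending() const { return incoming_.size(); }

private:
    bool disconnected_ = false;
    std::deque<std::string> incoming_;
};
```

The first 1000 queued packets are accepted, the 1001st disconnects the client and clears its queue, and everything after that is ignored until the client object is cleaned up.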