Gammon Forum

Epic Failure With Copyover (Linux)

It is now over 60 days since the last post. This thread is closed.



Posted by YoshoFyre   (29 posts)
Date Mon 02 Aug 2010 06:49 AM (UTC)
Message
So every time I attempt to hot-reboot the MUD, I get the following error when it restarts, after trying to reconnect the clients.

Mon Aug 2 02:34:48 2010 :: MUD is ready to rock on ports 6666 and 8080.
Mon Aug 2 02:35:17 2010 :: Sock.sinaddr: 192.168.1.210
Mon Aug 2 02:35:23 2010 :: Loading Character.
Mon Aug 2 02:35:26 2010 :: Character@192.168.1.210 has connected.
Init socket: bind: Address already in use

this causes the mud to crash out...

any suggestions?

Posted by Nick Gammon   Australia  (23,017 posts)   Forum Administrator
Date Reply #1 on Mon 02 Aug 2010 07:47 AM (UTC)
Message
It looks like the copyover (hotboot) processing didn't release the connection (the mud "binds" to the port) so afterwards it found the port in use (probably by itself).

It's hard to help more without knowing what version of ROM you are using, where you got the copyover from, and so on.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by YoshoFyre   (29 posts)
Date Reply #2 on Mon 02 Aug 2010 11:23 AM (UTC)
Message
It started as Rom2.4b... but it's been heavily modified...

I'm also unsure of the copyover file's origin... I originally modded the MUD back in 2001 under Cygwin. I have recently reloaded the code from backup and am now running it under Linux. When the code was running under Cygwin, the MUD would hot-reboot once... and crash the second time... I never really had a chance to debug it further back then.

here is the comment from the source file for copyover... if that helps at all


/*

You need to define:

COPYOVER_FILE - temporary data file used
EXE_FILE - file to be exec'ed (i.e. the MUD)


Note that I advance level 1 chars to level 2 - this is necessary in MERC and
Envy, but I think that ROM saves level 1 characters too.

Note that you might want to change your close_socket() a bit. I have changed
the connected state so that negative states represent logging-in, while as
positive ones represent states where the character is already inside the game.
close_socket() frees the characters with negative state, but just loses
link on those with a positive state. I believe that idea comes from Elwyn
originally.

Things to note: This corresponds to a reboot, followed by the characters
logging in again. This means that stuff like corpses, disarmed weapons etc.
are lost, unless you save those to the pfile. You should probably give the
players some warning before doing a copyover.

The command was inspired by the discussion on merc-l about how Fusion's MUD++
could reboot without players having to re-login :)

*/

I noticed that it suggests changing close_socket(), but it doesn't go into detail about what I need to do...

thanks

Posted by Nick Gammon   Australia  (23,017 posts)   Forum Administrator
Date Reply #3 on Mon 02 Aug 2010 08:53 PM (UTC)
Message
The inability to bind refers to the single "listening" socket - the one that accepts connections (eg. to port 4000). The other sockets are the ones established per-player.

At this stage I would have to guess that the copyover does not close the listening socket, thus later when it tries to re-open it, it is still in use. It doesn't hurt to close this socket as you don't send player data down it, it simply gets new connection notifications.
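
The failure described above can be reproduced in isolation. The sketch below is not taken from ROM: open_listener, bound_port and rebind_while_open are hypothetical helper names, and it uses the loopback address with a kernel-chosen port rather than the MUD's real port. It opens one listening socket, leaves it open, and then tries to bind a second one to the same port, which is roughly the situation a copyover creates if the old listening descriptor is still open when init_socket runs again.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a TCP listening socket on the loopback address.
 * port == 0 lets the kernel pick a free port.
 * Returns the descriptor, or -1 with errno set. */
int open_listener(unsigned short port)
{
    struct sockaddr_in sa;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&sa, sizeof sa) < 0 || listen(fd, 3) < 0) {
        int saved = errno;          /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Report which port a bound socket ended up on (0 on failure). */
unsigned short bound_port(int fd)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof sa;

    assert(fd >= 0);
    if (getsockname(fd, (struct sockaddr *)&sa, &len) < 0)
        return 0;
    return ntohs(sa.sin_port);
}

/* Try to bind a second listener to a port whose first listener is
 * still open. Returns the errno of the second attempt (0 if it
 * unexpectedly succeeded, -1 if the first listener could not be made). */
int rebind_while_open(void)
{
    int first = open_listener(0);
    int second, err;

    if (first < 0)
        return -1;
    second = open_listener(bound_port(first));
    err = (second < 0) ? errno : 0;
    if (second >= 0)
        close(second);
    close(first);
    return err;
}
```

On Linux, rebind_while_open() is expected to return EADDRINUSE, the errno behind the "Address already in use" message in the log.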

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by YoshoFyre   (29 posts)
Date Reply #4 on Tue 03 Aug 2010 12:04 AM (UTC)

Amended on Tue 03 Aug 2010 12:05 AM (UTC) by YoshoFyre

Message
Nick Gammon said:

The inability to bind refers to the single "listening" socket - the one that accepts connections (eg. to port 4000). The other sockets are the ones established per-player.

At this stage I would have to guess that the copyover does not close the listening socket, thus later when it tries to re-open it, it is still in use. It doesn't hurt to close this socket as you don't send player data down it, it simply gets new connection notifications.

So where would I put that... in do_copyover after the execl? Or near the beginning of copyover_recover?

I know close_socket(d) kills the player, but how do I close the listening socket?

Posted by Nick Gammon   Australia  (23,017 posts)   Forum Administrator
Date Reply #5 on Tue 03 Aug 2010 03:04 AM (UTC)
Message
It's the control port. In Smaug it looks like this:


   boot_db( fCopyOver );
   log_string( "Initializing socket" );
   if( !fCopyOver )  /* We have already the port if copyover'ed */
      control = init_socket( port );


Note how the control socket is not initialized on a copyover.
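
If the control socket is not re-created after a copyover, the new process needs the old descriptor number from somewhere. A common approach, assumed here rather than shown in the Smaug excerpt, is to pass it as a command-line argument to the exec'ed binary. The sketch below shows one way to read such an argument and to sanity-check that the descriptor really is a listening socket before handing it to the game loop. parse_fd_arg and is_listening_socket are hypothetical names, and SO_ACCEPTCONN is assumed to be available, as it is on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/socket.h>

/* Convert a command-line argument into a descriptor number.
 * Returns -1 if the text is not a plain non-negative integer. */
int parse_fd_arg(const char *text)
{
    char *end;
    long value;

    assert(text != NULL);
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0'
        || value < 0 || value > INT_MAX)
        return -1;
    return (int)value;
}

/* Returns 1 if fd is a socket in the listening state, 0 otherwise
 * (including when fd is closed or is not a socket at all). */
int is_listening_socket(int fd)
{
    int listening = 0;
    socklen_t len = sizeof listening;

    if (getsockopt(fd, SOL_SOCKET, SO_ACCEPTCONN, &listening, &len) < 0)
        return 0;               /* e.g. EBADF or ENOTSOCK */
    return listening != 0;
}
```

A startup path could then call parse_fd_arg on the relevant argument and refuse to continue, or fall back to init_socket, when is_listening_socket reports 0.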

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by YoshoFyre   (29 posts)
Date Reply #6 on Tue 03 Aug 2010 03:21 AM (UTC)
Message
Nick Gammon said:

It's the control port. In Smaug it looks like this:


   boot_db( fCopyOver );
   log_string( "Initializing socket" );
   if( !fCopyOver )  /* We have already the port if copyover'ed */
      control = init_socket( port );


Note how the control socket is not initialized on a copyover.

Thanks a bunch!! that worked...

But now I have a new problem... after the copyover, I can't get any more incoming connections. Everyone who was connected remains, but when I try to bring in a new client, it shows "connected" but then just sits there and does nothing...

Posted by Nick Gammon   Australia  (23,017 posts)   Forum Administrator
Date Reply #7 on Tue 03 Aug 2010 05:12 AM (UTC)

Amended on Tue 03 Aug 2010 05:13 AM (UTC) by Nick Gammon

Message
That happened to me a while back, and it is quite frustrating to fix. Certainly it will be hard without seeing the code or having access to debugging it.

Basically there is code in the comm.c along these lines:


   /*
    * Poll all active descriptors.
    */
   FD_ZERO( &in_set );
   FD_ZERO( &out_set );
   FD_ZERO( &exc_set );
   FD_SET( ctrl, &in_set );
   maxdesc = ctrl;
   newdesc = 0;

  // add in active descriptors to the sets here

   if( select( maxdesc + 1, &in_set, &out_set, &exc_set, &null_time ) < 0 )
   {
      perror( "accept_new: select: poll" );
      exit( 1 );
   }


Now if the "sets" are not set up right, or you pass the wrong number to "select" then it doesn't realize there is activity on that particular descriptor. I would check (perhaps in gdb) that these are being set up right.
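
The effect of passing the wrong number to select can be seen in a small stand-alone test. This is only a sketch, not code from comm.c: poll_readable and listener_with_pending_client are hypothetical helpers, the listener lives on the loopback address with a kernel-chosen port, and a short timeout is used instead of null_time so the demonstration cannot hang.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Poll one descriptor for readability, the way the game loop polls the
 * control socket. nfds is passed to select() unchanged so that the
 * effect of a wrong value can be observed.
 * Returns 1 if select() reports fd as readable, otherwise 0. */
int poll_readable(int fd, int nfds)
{
    fd_set in_set;
    struct timeval timeout = { 0, 200000 };    /* 0.2 seconds */
    int ready;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&in_set);
    FD_SET(fd, &in_set);
    ready = select(nfds, &in_set, NULL, NULL, &timeout);
    if (ready <= 0)
        return 0;               /* error or nothing reported */
    return FD_ISSET(fd, &in_set) ? 1 : 0;
}

/* Create a loopback listener that has one connection waiting to be
 * accepted. Stores the client descriptor in *client and returns the
 * listening descriptor, or -1 on failure (descriptors are not cleaned
 * up on failure; this is only a demonstration). */
int listener_with_pending_client(int *client)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof sa;
    int ctrl = socket(AF_INET, SOCK_STREAM, 0);

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* port 0: kernel picks */
    *client = socket(AF_INET, SOCK_STREAM, 0);
    if (ctrl < 0 || *client < 0
        || bind(ctrl, (struct sockaddr *)&sa, sizeof sa) < 0
        || listen(ctrl, 3) < 0
        || getsockname(ctrl, (struct sockaddr *)&sa, &len) < 0
        || connect(*client, (struct sockaddr *)&sa, sizeof sa) < 0)
        return -1;
    return ctrl;
}
```

With a connection pending, poll_readable(ctrl, ctrl + 1) should report the listener as readable, while poll_readable(ctrl, ctrl) should not, because select only examines descriptors numbered below its first argument.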

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by YoshoFyre   (29 posts)
Date Reply #8 on Tue 03 Aug 2010 06:31 AM (UTC)
Message
here is mine from comm.c

/*
	 * Poll all active descriptors.
	 */
	FD_ZERO( &in_set  );
	FD_ZERO( &out_set );
	FD_ZERO( &exc_set );
	FD_SET( control, &in_set );
	maxdesc	= control;
	for ( d = descriptor_list; d; d = d->next )
	{
	    maxdesc = UMAX( maxdesc, d->descriptor );
	    FD_SET( d->descriptor, &in_set  );
	    FD_SET( d->descriptor, &out_set );
	    FD_SET( d->descriptor, &exc_set );
	}

	if ( select( maxdesc+1, &in_set, &out_set, &exc_set, &null_time ) < 0 )
	{
	    perror( "Game_loop: select: poll" );
	    exit( 1 );
	}

	/*
	 * New connection?
	 */
	if ( FD_ISSET( control, &in_set ) )
	    init_descriptor( control );

Posted by Nick Gammon   Australia  (23,017 posts)   Forum Administrator
Date Reply #9 on Tue 03 Aug 2010 09:18 PM (UTC)
Message
All I can suggest at this point is to fire up gdb and try to work out what is happening (particularly to the control port) during a copyover.

http://www.gammon.com.au/gdb

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by YoshoFyre   (29 posts)
Date Reply #10 on Wed 04 Aug 2010 12:10 AM (UTC)

Amended on Wed 04 Aug 2010 12:25 AM (UTC) by YoshoFyre

Message
OK, I have gdb installed, and I see how to use it when a program crashes, but I don't see how to get information from it when the program hits an error while still running and doesn't crash...

how would i examine the control port?

If it helps any... here is pastebin with my copyover function and copyover recover function

http://pastebin.com/raw.php?i=5zWu5Pu9

Posted by Nick Gammon   Australia  (23,017 posts)   Forum Administrator
Date Reply #11 on Wed 04 Aug 2010 01:22 AM (UTC)
Message
Speedline02VA said:

ok i have gdb installed, and i see how to use it when a program crashes out but i don't see how to get info from it when the program is still running and the program errors but doesn't crash...


You start up under gdb. Something like this:


cd ../area
gdb ../src/rom

# set breakpoints now

run 4000    # start up - argument (port) is 4000


- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by YoshoFyre   (29 posts)
Date Reply #12 on Wed 04 Aug 2010 02:17 AM (UTC)

Amended on Wed 04 Aug 2010 02:23 AM (UTC) by YoshoFyre

Message
Nick Gammon said:

Speedline02VA said:

ok i have gdb installed, and i see how to use it when a program crashes out but i don't see how to get info from it when the program is still running and the program errors but doesn't crash...


You start up under gdb. Something like this:


cd ../area
gdb ../src/rom

# set breakpoints now

run 4000    # start up - argument (port) is 4000



I gotcha. I am that far; what I can't figure out is where to set the breakpoints... I guess I should set breakpoints for all lines between the socket connection and the welcome screen, since it hangs somewhere in between?

Upon further review, if I do a copyover and then type anything in my SSH window while the MUD is running, it crashes out with:

New_descriptor: accept: Socket operation on non-socket
web-accept: Socket operation on non-socket
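
One reading consistent with these two messages, though not confirmed anywhere in the thread, is that after the copyover the variables passed to accept no longer hold socket descriptors. If they held 0 (standard input), for example, typing in the terminal would make that descriptor readable, select would report it, and accept would fail. The sketch below only demonstrates the error itself: accept_errno and accept_on_pipe are hypothetical helpers, and a pipe stands in for the terminal, since both are valid descriptors that are not sockets.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Call accept() on an arbitrary descriptor and return the resulting
 * errno (0 if accept unexpectedly succeeded; the new descriptor is
 * closed in that case). */
int accept_errno(int fd)
{
    int conn;

    assert(fd >= 0);
    conn = accept(fd, NULL, NULL);
    if (conn >= 0) {
        close(conn);
        return 0;
    }
    return errno;
}

/* Run accept() on the read end of a pipe, standing in for a terminal
 * on stdin. Returns the errno from accept(), or -1 if the pipe could
 * not be created. */
int accept_on_pipe(void)
{
    int ends[2];
    int err;

    if (pipe(ends) < 0)
        return -1;
    err = accept_errno(ends[0]);
    close(ends[0]);
    close(ends[1]);
    return err;
}
```

accept_on_pipe() is expected to return ENOTSOCK, which perror prints as "Socket operation on non-socket" on Linux.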

Posted by Nick Gammon   Australia  (23,017 posts)   Forum Administrator
Date Reply #13 on Wed 04 Aug 2010 05:26 AM (UTC)
Message
Back to basics, first. Is this Cygwin, or Linux? Or something else?

I wouldn't set a breakpoint "for all lines". Breakpoints stop execution at some point you consider important. After the breakpoint is hit (if it is hit) then you can step forwards, or examine variables, a line at a time, to confirm some hypothesis you might have.

Like, on the "select" line above, you might put a breakpoint and print what the value of maxdesc is.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by YoshoFyre   (29 posts)
Date Reply #14 on Thu 05 Aug 2010 12:18 AM (UTC)

Amended on Thu 05 Aug 2010 12:23 AM (UTC) by YoshoFyre

Message
Nick Gammon said:

Back to basics, first. Is this Cygwin, or Linux? Or something else?

I wouldn't set a breakpoint "for all lines". Breakpoints stop execution at some point you consider important. After the breakpoint is hit (if it is hit) then you can step forwards, or examine variables, a line at a time, to confirm some hypothesis you might have.

Like, on the "select" line above, you might put a breakpoint and print what the value of maxdesc is.

I'm running Ubuntu Linux:

Linux mud 2.6.22-14-server #1 SMP Sun Oct 14 23:34:23 GMT 2007 i686

Distributor ID: Ubuntu
Description:    Ubuntu 7.10
Release:        7.10
Codename:       gutsy

GCC version 3.4



so here is the connection part of my comm.c

/* Are we recovering from a Hot Reboot? */
if (argv[3] && argv[3][0])
{
  fCopyOver = TRUE;
  //control = atoi(argv[4]);
  if( !fCopyOver )  /* We have already the port if copyover'ed */
  {
    control = init_socket( argv[4] );
    wwwcontrol = init_socket( wwwport );
  }
}
else
  fCopyOver = FALSE;
if (!fCopyOver)
{
  control = init_socket( port );
  wwwcontrol = init_socket( wwwport );
}
boot_db();
if (!fCopyOver)
  init_web(7001);
sprintf( log_buf, "MUD is ready to rock on ports %d and %d.", port, wwwport );
log_string( log_buf );   
if (fCopyOver)
  copyover_recover();
game_loop_unix( control, wwwcontrol );
shutdown_web();
close (wwwcontrol);
//if(!fCopyOver)
  //close (control);


all the commented out stuff i did many years ago...

the code in bold i just modded per earlier

Edit: it appears that when I searched for the lines of code to mod with if (!fCopyOver), I did not notice that I already have that check... so something must be wrong with control = init_socket( argv[4] ); to make the MUD crash under copyover...

Perhaps I should edit it back, and then set the breakpoint on that line? Thoughts?
This is page 1; the subject is 2 pages long.

Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.
