doubts about the sockets 2

This is the forum for miscellaneous technical/programming questions.

Moderator: 2ffat

doubts about the sockets 2

Postby mark_c » Wed Aug 29, 2018 2:57 am

Hello,
I had already expressed my doubts about this issue but now I wanted to understand a little more what windows does with sockets.
The code initially negotiates the size of the incoming package but, as far as I could see, it seems that Size's negotiated dimensions are just a suggestion for the size of Buf since, it is not true then that for a 1000 byte negotiated buffer it is then filled with 1000 bytes; for example, sometimes for a Size = 1460 bytes if you ask for a strlen (Buf) this has dimensions of 4 bytes.

How is it possible?

Another question is because sometimes the while is executed only once and some times 2 but also 3 but the dimensions Size always remain the same

Code: Select all

void __fastcall TForm1::ClientSocket1Read(TObject *Sender,
TCustomWinSocket *Socket)
{
   if( !ClientSocket1->Active ) return;

   String MyBuffer;
   int executions_time=0;
   int maximum_executions_time=0;

   int ByteRecived;
   int Size = Socket->ReceiveLength();
   char *Buf = new char[Size];

   while( ByteRecived = ClientSocket1->Socket->ReceiveBuf(Buf, Size) > 0 )
   {
      MyBuffer+=Buf;

      Label2->Caption=MyBuffer.Length();
      Label2->Update();

      Label3->Caption=executions_time++;
      Label3->Update();

      if(executions_time > maximum_executions_time) maximum_executions_time=executions_time;

      Label4->Caption=maximum_executions_time;
      Label4->Update();

      fprintf(rp,"%d,%d,%d,%d,%d,%s\n", Size, ByteRecived,strlen(Buf),executions_time,maximum_executions_time,Buf);
   }

   
   Label1->Caption=Size;
   Label1->Update();
}
Last edited by mark_c on Wed Aug 29, 2018 12:19 pm, edited 1 time in total.
mark_c
BCBJ Guru
BCBJ Guru
 
Posts: 129
Joined: Thu Jun 21, 2012 1:13 am

Re: doubts about the sockets 2

Postby rlebeau » Wed Aug 29, 2018 11:29 am

mark_c wrote:The code initially negotiates the size of the incoming package


And how is it doing that exactly? I'm assuming you are referring to the socket's SO_RCVBUF option? That is just an in-memory buffer, it has nothing to do with how many bytes the sender actually sends.

mark_c wrote:but, as far as I could see, it seems that Size's negotiated dimensions are just a suggestion for the size of Buf since, it is not true then that for a 1000 byte negotiated buffer it is then filled with 1000 bytes;


Just because you ask the socket to use a 1000-byte buffer doesn't mean the sender is actually sending 1000 bytes. The OnRead event is triggered when there is at least 1 byte available in the socket's internal buffer. ReceiveLength() returns the actual number of unread bytes available in that buffer. That is the maximum number of bytes that ReceiveBuf() is guaranteed to return without needing to block the calling thread (unless another thread reads the bytes first, but for simplicity lets assume that is not the case here). By the time you call ReceiveBuf(), more bytes may have arrived, which will sit in the socket's internal buffer until you read them, which may be delayed until a later OnRead event.

mark_c wrote:for example, sometimes for a Size = 1460 bytes if you ask for a strlen (Buf) this has dimensions of 4 bytes.

How is it possible?


Because strlen() requires a null-terminated string, but socket data is not null-terminated, and you are not injecting your own terminator at the end of the data before calling strlen() (same when appending Buf to MyBuffer). Also, binary data may contain null bytes as well, further confusing strlen().

In short, don't use strlen() (or otherwise rely on a null terminator) when dealing with socket data. There is a reason why ReceiveBuf() returns the actual number of bytes read. Use that value instead.

mark_c wrote:Another question is because sometimes the while is executed only once and some times 2 but also 3 but the Dimentious Size always remain the same


Because your reading loop is not updating the Size on each iteration. You are just blindly reading until there is nothing left to read. You need to either:

- decrement Size by the number of bytes actually read.

- call ReceiveSize() again.

You code should look something more like this:

Code: Select all
void __fastcall TForm1::ClientSocket1Read(TObject *Sender, TCustomWinSocket *Socket)
{
    AnsiString MyBuffer;
    int executions_time = 0;
    int maximum_executions_time = 0;
    int ByteRecived;

    int Size = Socket->ReceiveLength();
    if (Size <= 0) return;

    char *Buf = new char[Size];

    do
    {
        ByteRecived = Socket->ReceiveBuf(Buf, Size);
        if (ByteRecived <= 0) break;

        MyBuffer += AnsiString(Buf, ByteRecived);

        Label2->Caption = MyBuffer.Length();
   Label2->Update();

        Label3->Caption = executions_time++;
        Label3->Update();

   if (executions_time > maximum_executions_time) maximum_executions_time = executions_time;

        Label4->Caption = maximum_executions_time;
   Label4->Update();

        fprintf(rp, "%d,%d,%d,%d,%.*s\n", Size, ByteRecived, executions_time, maximum_executions_time, ByteRecived, Buf);

        Size -= ByteRecived;
    }
    while (Size > 0);

    delete[] Buf;

    Label1->Caption = Size;
    Label1->Update();
}


Or:

Code: Select all
void __fastcall TForm1::ClientSocket1Read(TObject *Sender, TCustomWinSocket *Socket)
{
    AnsiString MyBuffer;
    int executions_time = 0;
    int maximum_executions_time = 0;
    int ByteRecived;

    int Size = Socket->ReceiveLength();
    if (Size <= 0) return;

    char Buf[1024];

    do
    {
        ByteRecived = Socket->ReceiveBuf(Buf, min(Size, sizeof(Buf)));
        if (ByteRecived <= 0) break;

        MyBuffer += AnsiString(Buf, ByteRecived);

        Label2->Caption = MyBuffer.Length();
   Label2->Update();

        Label3->Caption = executions_time++;
        Label3->Update();

   if (executions_time > maximum_executions_time) maximum_executions_time = executions_time;

        Label4->Caption = maximum_executions_time;
   Label4->Update();

        fprintf(rp, "%d,%d,%d,%d,%.*s\n", Size, ByteRecived, executions_time, maximum_executions_time, ByteRecived, Buf);

        Size -= ByteRecived;
    }
    while (Size > 0);

    Label1->Caption = Size;
    Label1->Update();
}


Either way, this kind of coding is fine for debugging purposes, but this is not a very good way to handle socket data in general. Socket protocols have structure to them, you should be writing your reading code to handle that structure properly.
Remy Lebeau (TeamB)
Lebeau Software
User avatar
rlebeau
BCBJ Author
BCBJ Author
 
Posts: 1533
Joined: Wed Jun 01, 2005 3:21 am
Location: California, USA

Re: doubts about the sockets 2

Postby mark_c » Wed Aug 29, 2018 12:21 pm

sorry Remy forgetfulness,
why ByteRecived in the loop is always equal to 1 and not to the number of bytes read?
mark_c
BCBJ Guru
BCBJ Guru
 
Posts: 129
Joined: Thu Jun 21, 2012 1:13 am

Re: doubts about the sockets 2

Postby rlebeau » Wed Aug 29, 2018 12:43 pm

mark_c wrote:why ByteRecived in the loop is always equal to 1 and not to the number of bytes read?


The only way that can happen is if ReceiveBuf() is actually reading only 1 byte at a time.

Update: scratch that. There is another way it could be 1. See further below for an explanation.
Last edited by rlebeau on Thu Aug 30, 2018 11:29 am, edited 1 time in total.
Remy Lebeau (TeamB)
Lebeau Software
User avatar
rlebeau
BCBJ Author
BCBJ Author
 
Posts: 1533
Joined: Wed Jun 01, 2005 3:21 am
Location: California, USA

Re: doubts about the sockets 2

Postby mark_c » Wed Aug 29, 2018 1:24 pm

sorry Remy another doubt: it is possible that during the loop
Code: Select all
while (ByteRecived = ClientSocket1-> Socket-> ReceiveBuf (Buf, Size)> 0)

and before it has been completed, is it abandoned to handle a new ClientSocket1Read event?
I think that doing so would lose data.

During the execution of the while, any other data generated by the server, who manages them in order not to lose them, maybe a particular buffer of the Windows OS?
mark_c
BCBJ Guru
BCBJ Guru
 
Posts: 129
Joined: Thu Jun 21, 2012 1:13 am

Re: doubts about the sockets 2

Postby rlebeau » Thu Aug 30, 2018 11:27 am

mark_c wrote:it is possible that during the loop

Code: Select all
while (ByteRecived = ClientSocket1-> Socket-> ReceiveBuf (Buf, Size)> 0)


and before it has been completed, is it abandoned to handle a new ClientSocket1Read event?


No.

More importantly, that line of code is written incorrectly, which is why ByteRecived is always 1. Per C++ Operator Precedence, the '>' operator has higher precedence than the '=' operator, so that line is interpreted by the compiler as if you had written it like this:

Code: Select all
while (ByteRecived = (ClientSocket1-> Socket-> ReceiveBuf (Buf, Size) > 0))


Which is not what you want. When ReceiveBuf() returns a value > 0, ByteRecived is being assigned a boolean 'true', which is implicitly converted to an integer 1.

You need to write the line like this instead:

Code: Select all
while ((ByteRecived = ClientSocket1-> Socket-> ReceiveBuf (Buf, Size)) > 0)


That way the assignment occurs before the comparison, not the other way around.

This is a good reason why you should never write code that performs assignments and comparisons in the same expression.
Remy Lebeau (TeamB)
Lebeau Software
User avatar
rlebeau
BCBJ Author
BCBJ Author
 
Posts: 1533
Joined: Wed Jun 01, 2005 3:21 am
Location: California, USA

Re: doubts about the sockets 2

Postby mark_c » Fri Aug 31, 2018 4:47 am

thanks Remy.

How come the two versions of the same code below produce different results on the file?
For example, if I read binary files from sockets, I have differences.

Code: Select all
void __fastcall TForm1::ClientSocket1Read(TObject *Sender,
TCustomWinSocket *Socket)
{
   if( !ClientSocket1->Active ) return;

   String MyBuffer;

   int ByteRecived;
   
   int Size = Socket->ReceiveLength();
   
   if(Size <=0) return;
   
   char *Buf = new char[Size];

   while( (ByteRecived = ClientSocket1->Socket->ReceiveBuf(Buf, Size)) > 0 )
   {
      MyBuffer+=Buf;
   }
   
   for(int i=1; i<MyBuffer.Length(); i++)
   fprintf(fp,"%c",MyBuffer[i]);
   
   MyBuffer="";
   delete [] Buf;
}





void __fastcall TForm1::ClientSocket1Read(TObject *Sender,
TCustomWinSocket *Socket)
{
   if( !ClientSocket1->Active ) return;


   int ByteRecived;
   
   int Size = Socket->ReceiveLength();
   
   if(Size <=0) return;
   
   char *Buf = new char[Size];

   while( (ByteRecived = ClientSocket1->Socket->ReceiveBuf(Buf, Size)) > 0 )
   {
      for(int i=1; i<ByteReceived; i++)
         fprintf(fp,"%c",Buf[i]);
   }
   
      delete [] Buf;
}
mark_c
BCBJ Guru
BCBJ Guru
 
Posts: 129
Joined: Thu Jun 21, 2012 1:13 am

Re: doubts about the sockets 2

Postby rlebeau » Fri Aug 31, 2018 10:31 am

mark_c wrote:How come the two versions of the same code below produce different results on the file?
For example, if I read binary files from sockets, I have differences.


You clearly have not been paying attention, because your code examples still have several errors that I have already addressed in earlier replies. Please go back and re-read the history of this discussion thread again more carefully:

- You never said which version of C++Builder you are using. If CB2009+, String is UnicodeString, not AnsiString. Notice in my examples that I changed String to AnsiString. This is important. Otherwise, you should use a more suitable byte container instead, such as std::vector<Byte>.

- You are ignoring ByteRecived when appending Buf to MyBuffer. You are assuming Buf is null-terminated, but it is not. You must take ByteRecived into account when appending Buf to MyBuffer.

- You are ignoring Size while looping, so you loop more than you should be.

As for why your two examples are producing different file outputs - it is because they both have errors in their respective logics.

In the first example, you are potentially writing garbage bytes to MyBuffer due to mishandling of null terminators. But more importantly, the 'for' loop that saves MyBuffer to file is skipping the last byte saved in MyBuffer, as it is using '<' instead of '<='. Since it is looping through a 1-indexed string, '<' will ignore the last index.

In the second example, the 'for' loop is skipping the first byte received in Buf, because it is starting at index 1 instead of 0. Buf is a 0-indexed array, not a 1-indexed string.

In both cases, it would be a lot easier to just call fwrite() one time with the entire data instead of calling fprintf() in a loop:

Code: Select all
void __fastcall TForm1::ClientSocket1Read(TObject *Sender, TCustomWinSocket *Socket)
{
   AnsiString MyBuffer;
   int ByteRecived;
   
   int Size = Socket->ReceiveLength();
   if (Size <= 0) return;
   
   char *Buf = new char[Size];

   do
   {
      if ((ByteRecived = ClientSocket1->Socket->ReceiveBuf(Buf, Size)) <= 0) break;
      MyBuffer += AnsiString(Buf, ByteRecived);
      Size -= ByteRecived;
   }   
   while (Size > 0);

   /*
   for(int i = 1; i <= MyBuffer.Length(); i++)
      fprintf(fp,"%c", MyBuffer[i]);
   */
   fwrite(MyBuffer.c_str(), 1, MyBuffer.Length(), fp);

   delete [] Buf;
}


Code: Select all
void __fastcall TForm1::ClientSocket1Read(TObject *Sender, TCustomWinSocket *Socket)
{
   int ByteRecived;
   
   int Size = Socket->ReceiveLength();
   if (Size <= 0) return;
   
   char *Buf = new char[Size];

   do
   {
      if ((ByteRecived = ClientSocket1->Socket->ReceiveBuf(Buf, Size)) <= 0) break;
      /*
      for(int i = 0; i < ByteRecived; i++)
         fprintf(fp, "%c", Buf[i]);
      */
      fwrite(Buf, 1, ByteRecived, fp);
      Size -= ByteRecived;
   }
   while (Size > 0);
   
   delete [] Buf;
}


Alternatively, I would suggest some further tweaks to both examples to make more effective use of Buf:

Code: Select all
void __fastcall TForm1::ClientSocket1Read(TObject *Sender, TCustomWinSocket *Socket)
{
   int ByteRecived;
   
   int SizeToRead = Socket->ReceiveLength();
   if (SizeToRead <= 0) return;
   
   char *Buf = new char[SizeToRead];

   char *pBuf = Buf;
   int SizeInBuf = 0;

   do
   {
      if ((ByteRecived = ClientSocket1->Socket->ReceiveBuf(pBuf, SizeToRead)) <= 0) break;
      SizeInBuf += ByteRecived;
      SizeToRead -= ByteRecived;
   }   
   while (SizeToRead > 0);

   /*
   for(int i = 0; i < SizeInBuf; i++)
      fprintf(fp,"%c", Buf[i]);
   */
   fwrite(Buf, 1, SizeInBuf, fp);

   delete [] Buf;
}


Code: Select all
void __fastcall TForm1::ClientSocket1Read(TObject *Sender, TCustomWinSocket *Socket)
{
   int ByteRecived;
   
   int SizeToRead = Socket->ReceiveLength();
   if (SizeToRead <= 0) return;
   
   int BufSize = min(SizeToRead, 1024);
   char *Buf = new char[BufSize];

   do
   {
      if ((ByteRecived = ClientSocket1->Socket->ReceiveBuf(Buf, min(SizeToRead, BufSize))) <= 0) break;
      /*
      for(int i = 0; i < ByteRecived; i++)
         fprintf(fp, "%c", Buf[i]);
      */
      fwrite(Buf, 1, ByteRecived, fp);
      SizeToRead -= ByteRecived;
   }
   while (SizeToRead > 0);
   
   delete [] Buf;
}


Now, with that said, why are you using C style file I/O at all, instead of using C++ style file I/O from the C++ standard library (std::ofstream) or C++Builder's RTL (TFileStream, TStreamWriter, etc)?

Lastly, there is simply no structure to your communications at all. The kind of code you have written so far will only work if the server is dumping a file as-is to the connection as soon as the client connects to the server, and then the server disconnects the client after sending the file. But TCP protocols are rarely ever that simple. They have structure to their messages (how does a message start? How does it end? What kind of data fields does a message contain? Are they fixed-length or variable-length? How many bytes are in each field? In what formats?). Commands and responses. Etc. You don't have ANY of that in your code yet. But you need that if you want to be effective at network programming.

For example, when sending a file, you should send the file's total size as a fixed-sized integer (preferably as a 64bit integer, unless 32bit suits your needs) in network byte order, and optionally also its name as a length-prefixed UTF-8 string, before sending the actual file data. That way, the receiver can know how many bytes to expect up front so it knows when to stop writing bytes to its local file, and optionally knows what to name its local file.

There is a lot of things you have to handle when doing network programming. And you are having to do all of them manually since you are coding (almost) directly to the TCP socket, instead of using a higher-level wrapper that handles these kind of details for you. TClientSocket is very bare-bones, consider switching to a better socket component instead, such as TIdTCPClient in Indy, which has many methods available for reading/sending integers, strings, streams, etc.
Remy Lebeau (TeamB)
Lebeau Software
User avatar
rlebeau
BCBJ Author
BCBJ Author
 
Posts: 1533
Joined: Wed Jun 01, 2005 3:21 am
Location: California, USA

Re: doubts about the sockets 2

Postby mark_c » Fri Aug 31, 2018 10:57 am

thanks Remy,
in reality I'm using

Code: Select all
for (int i = 1; i <ByteReceived; i ++)
          fprintf (fp, "%c", Buf [i]);


because I have to save the buffer only if it starts with FFD8 and ends with FFD9; that's why I use fprintf
mark_c
BCBJ Guru
BCBJ Guru
 
Posts: 129
Joined: Thu Jun 21, 2012 1:13 am

Re: doubts about the sockets 2

Postby rlebeau » Fri Aug 31, 2018 12:35 pm

mark_c wrote:thanks Remy,
in reality I'm using

Code: Select all
for (int i = 1; i <ByteReceived; i ++)
   fprintf (fp, "%c", Buf [i]);


because I have to save the buffer only if it starts with FFD8 and ends with FFD9; that's why I use fprintf


That doesn't change what I said earlier. And besides, that is not a good reason to use fprintf() anyway. Especially since you are still using it wrong.

For starters, FFDB and FFD9 are not single-byte values. Since you are using text formatting, the real bytes are more likely to be simply 0xD8 and 0xD9 and are being sign-extended to 16bit values because the high bit of both 0xD8 and 0xD9 is 1, and 'char' may be either signed or unsigned depending on the compiler (in C++Builder, 'char' is signed by default - there is a compiler option to change that).

But, you really should NOT be using fprintf() to write single characters to begin with, use fwrite() instead:

Code: Select all
for (int i = 1; i < ByteReceived; i ++)
   fwrite(&Buf[i], 1, 1, fp);


Though it would be better to use a single fwrite() call instead and simply give it a range of bytes to write, eg:

Code: Select all
fwrite(Buf+1, 1, ByteReceived-2, fp);


That being said, you are not handling 0xDB and 0xD9 at all in the code you have shown so far. TCP is a stream, there is no 1:1 relationship between sends and reads. A single message can straddle across multiple OnRead events, and a single OnRead event can receive multiple messages at a time. So you have to handle that possibility explicitly in your code, by using a persistent buffer for all incoming data and search that buffer for messages, eg:

Code: Select all
void __fastcall TForm1::ClientSocket1Connect(TObject *Sender, TCustomWinSocket *Socket)
{
   // buffer for incoming data...
   Socket->Data = new AnsiString;
}

void __fastcall TForm1::ClientSocket1Disconnect(TObject *Sender, TCustomWinSocket *Socket)
{
   delete static_cast<AnsiString*>(Socket->Data);
   Socket->Data = NULL;
}

void __fastcall TForm1::ClientSocket1Read(TObject *Sender, TCustomWinSocket *Socket)
{
   char Buf[256];
   int ByteRecived = ClientSocket1->Socket->ReceiveBuf(Buf, sizeof(Buf));
   if (ByteRecived <= 0) return;

   // append received bytes to end of buffer...
   AnsiString *MyBuffer = static_cast<AnsiString*>(Socket->Data);
   *MyBuffer += AnsiString(Buf, ByteRecived);

   // scan buffer for complete messages...

   // System::Pos(), PosEx(), AnsiString::Pos(), they are all inefficient for searching for single characters
   // in a loop, so using memchr() to scan the raw AnsiString memory directly...

   const char *pBuffer = MyBuffer->c_str();
   int BufferSize = MyBuffer->Length();
   int SizeProcessed = 0;

   do
   {
      // scan for start byte...
      char *pMsgStart = (char*) memchr(pBuffer, 0xD8, BufferSize);
      if (!pMsgStart)
      {
         // not found, wipe the buffer for next time...
         SizeProcessed += Size;
         break;
      }

      if (pMsgStart > pBuffer)
      {
         // not the first byte, ignore everything before it...
         int SizeToIgnore = (pMsgStart - pBuffer);
         pBuffer = pMsgStart;
         BufferSize -= SizeToIgnore;
         SizeProcessed += SizeToIgnore;
      }

      // scan for last byte...
      char *pMsgEnd = (char*) memchr(pMsgStart+1, 0xD9, BufferSize-1);
      if (!pMsgEnd)
      {
         // not found, keep the current message in the buffer for next time...
         break;
      }
      ++pMsgEnd;

      // total size between start and end bytes, inclusive
      int MsgSize = pMsgEnd - pMsgStart;
 
      // remove +1 and -2 if you want to save the start and end bytes...
      /*
      for (int i = 1; i < MsgSize-2; ++i)
          fwrite(pStart+i, 1, 1, fp);
      */
      fwrite(pStart+1, 1, MsgSize-2, fp);

      // done with current message...
      SizeProcessed += MsgSize;
      pBuffer = pMsgEnd;
      BufferSize -= MsgSize;
   }
   while (BufferSize > 0);

   if (SizeProcessed > 0)
      MyBuffer->Delete(1, SizeProcessed);
}
Remy Lebeau (TeamB)
Lebeau Software
User avatar
rlebeau
BCBJ Author
BCBJ Author
 
Posts: 1533
Joined: Wed Jun 01, 2005 3:21 am
Location: California, USA

Re: doubts about the sockets 2

Postby mark_c » Sat Sep 01, 2018 9:26 am

thanks Remy
mark_c
BCBJ Guru
BCBJ Guru
 
Posts: 129
Joined: Thu Jun 21, 2012 1:13 am


Return to Technical

Who is online

Users browsing this forum: Bing [Bot] and 15 guests

cron