Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http server runs out of sockets #809

Closed
tve opened this issue Feb 23, 2016 · 44 comments
Closed

http server runs out of sockets #809

tve opened this issue Feb 23, 2016 · 44 comments
Labels
bug ESP8266 This is only a problem on ESP8266 devices

Comments

@tve
Copy link
Contributor

tve commented Feb 23, 2016

Reported by Wilberforce:

When testing and debugging, after a while to http server code fails to start with an out of sockets error.
This happens after a number of uploads from the web ide.
I believe it is due to the old http server sessions not shutting down properly.
I have my board set up so it auto joins my local wifi.
At the moment I just hit the reset button on the board, and re upload, however it would be handy to shutdown the connections cleanly. Can this be down with an E.event on after a .reset?
Wifi.disconnect and wifi.connect() does do the job, however it would be great if an event could trigger.

@tve tve added bug ESP8266 This is only a problem on ESP8266 devices labels Feb 23, 2016
@gfwilliams
Copy link
Member

Well, socketserver.c gets called from jswrap_network_kill, which is called whenever there is a call to reset(), and that should close all the sockets.

Could it be that something about the reset() causes the nonblocking ESP8266 API calls not to be made, or not to complete?

@wilberforce
Copy link
Member

Thanks.
Is there anything in esp8266.getstate() we could use to report the number of sockets open, or anything else that would be useful for determing the error?
I.e show the number of open sockets?
The next time this occurs is there something I can dump that would be of use?

@tve
Copy link
Contributor Author

tve commented Feb 23, 2016

the debug log has messages about every socket open/close, should be easy to trace what happens there

@wilberforce
Copy link
Member

@tve - do you mean the console log in the IDE?

I was doing some web socket code that involved quite a few sends to the ESP8266 from the ide:

 _____                 _

|   __|___ ___ ___ _ _|_|___ ___
|   __|_ -| . |  _| | | |   | . |
|_____|___|  _|_| |___|_|_|_|___|
          |_| http://espruino.com
 1v84 Copyright 2016 G.Williams
WARNING: the esp8266 port is in beta!
Flash map 4MB:512/512, manuf 0xe0 chip 0x4016
>echo(0);
ERROR: Unable to create socket
=undefined
Sending block "rver.on(\"websoc", wait 50ms
Sent
Sending block "ket\", function(", wait 50ms
Sent
Sending block "ws) {\u001b\n   ws_ou", wait 50ms
Sent
Sending block "ter=ws;\u001b\n    ws", wait 50ms
Sent
Sending block ".on('message',f", wait 50ms
Sent
Sending block "unction(msg) { ", wait 50ms
Sent
Sending block "print(\"[WS] \"+J", wait 50ms
Sent
Sending block "SON.stringify(m", wait 50ms
Sent
Sending block "sg)); });\u001b\n    ", wait 50ms
Sent
Sending block "ws.send(\"Hello ", wait 50ms
Sent
Sending block "from Espruino, ", wait 50ms
Sent
Sending block "module!\");\u001b\n   ", wait 50ms
Sent
Sending block "ws.send(JSON.st", wait 50ms
Sent
Sending block "ringify({data:5", wait 50ms
Sent
Sending block "55}));\u001b\n });\nec", wait 50ms
Sent
Sending block "ho(1);\n", wait 50ms
Sent
>>> Sent

This output doesn't look that helpful.

var server = require('ws').createServer(onPageRequest);
 server.listen(80);
 server.on("websocket", function(ws) {
   ws_outer=ws;
    ws.on('message',function(msg) { print("[WS] "+JSON.stringify(msg)); });
    ws.send("Hello from Espruino, module!");
   ws.send(JSON.stringify({data:555}));
 });

the create server is failing on socket create...

@tve
Copy link
Contributor Author

tve commented Feb 25, 2016

I meant the uart1 output. I will look into this issue, but I'm just slammed.

@wilberforce
Copy link
Member

@tve
I appreciate that you have taken the time to respond when you're busy!

Is there a way to connect the uart1 to the web ide console as I'm using a nodemcu board?

@wilberforce
Copy link
Member

Notes for myself !
/*JSON{
"type" : "staticmethod",
"class" : "ESP8266",
"name" : "setLog",
"generate" : "jswrap_ESP8266_setLog",
"params" : [
["mode", "JsVar", "Debug log mode: 0=off, 1=in-memory only, 2=in-mem and uart0, 3=in-mem and uart1."]

/*JSON{
"type" : "staticmethod",
"class" : "ESP8266",
"name" : "printLog",
"generate" : "jswrap_ESP8266_printLog"
}

/*JSON{
"type" : "staticmethod",
"class" : "ESP8266",
"name" : "dumpSocketInfo",
"generate" : "jswrap_ESP8266_dumpSocketInfo"
}
Dumps info about all sockets to the log. This is for troubleshooting the socket implementation.

@tve
Copy link
Contributor Author

tve commented Feb 26, 2016

mhh, net.kill() causes an exception:

  "reason": "exception",
  "exccause": 28, "epc1": 1076117927, "epc2": 0, "epc3": 0, "excvaddr": 256,
  "depc": 0 }

On the esp8266 a reset() does not close server sockets cleanly. A dumpSocketInfo shows that a server socket remains in a disconnecting state.

@tve
Copy link
Contributor Author

tve commented Feb 26, 2016

Haha, from the code:

    // FIXME: do we get a disconnected callback or is this it? If we don't get a callback we can
    // go straight to SOCKET_STATE_CLOSED
    pSocketData->state = SOCKET_STATE_DISCONNECTING;

Looks like we don't get a disconnected callback... Probably my bad for not double-checking and fixing this.

gfwilliams added a commit that referenced this issue Feb 26, 2016
fix #809 esp8266 socket close
@davidmoshal
Copy link

Problem still exists in 1.85
ERROR: Unable to create socket
Does not respond to reset()
Does respond to hardware reset
Board: NodeMCU V2

@tve tve reopened this Jun 23, 2016
@tve
Copy link
Contributor Author

tve commented Jun 23, 2016

Steps to reproduce?

@davidmoshal
Copy link

Nothing fancy, just working on a very simple server will trigger the problem after a couple of uploads.

var PORT = 8080;

wifi.connect(
  "xxx", 
  {password:"xxx"},
  function(err){
    console.log("connected? err=", err, 
                "port:", PORT, 
                "info=", wifi.getIP());
    var server = http.createServer(function (req, res) {
       res.writeHead(200);
       res.end("Hello World");
    });
    server.listen(PORT);
  }
);

@wilberforce
Copy link
Member

wilberforce commented Jun 25, 2016

This version shows the debug output in the web ide:

var PORT = 8080;
var wifi=require("Wifi");
var http=require("http");
var ESP8266=require("ESP8266");

ESP8266.setLog(2);
wifi.connect(
  "ssid", 
  {password:"pwd"},
  function(err){
    console.log("connected? err=", err, 
                "port:", PORT, 
                "info=", wifi.getIP());
    var server = http.createServer(function (req, res) {
      console.log({sckt:server.sckt});
      ESP8266.dumpSocketInfo();
      console.log({log:ESP8266.printLog()});
      res.writeHead(200);
       res.write('<html><body>');
       res.write(Date.now());
      res.write("<br>Hello World");
      res.write("<br>socket:");
      res.write(server.sckt);
      res.end('</body></html>');

    });
    server.listen(PORT);
  }
);

@tve

I think the issue is here:

//===== Wifi soft_init
// This function is called in soft_init to hook-up the network. This happens from user_main's
// init_done() and also from `reset()` in order to re-hook-up the network.
void jswrap_ESP8266_wifi_soft_init() {
DBGV("> Wifi.soft_init\n");
// initialize the network stack
netInit_esp8266_board();
JsNetwork net;
networkCreate(&net, JSNETWORKTYPE_ESP8266_BOARD);
networkState = NETWORKSTATE_ONLINE;
DBGV("< Wifi.soft_init\n");
}

//===== Wifi soft_init

// This function is called in soft_init to hook-up the network. This happens from user_main's
// init_done() and also from `reset()` in order to re-hook-up the network.
void jswrap_ESP8266_wifi_soft_init() {
  DBGV("> Wifi.soft_init\n");

  // initialize the network stack
  netInit_esp8266_board();
  JsNetwork net;
  networkCreate(&net, JSNETWORKTYPE_ESP8266_BOARD);
  networkState = NETWORKSTATE_ONLINE;

  DBGV("< Wifi.soft_init\n");
}

which calls:
https://github.com/espruino/Espruino/blob/master/libs/network/esp8266/network_esp8266.c#L433-L437

/**
 * Initialize the entire socket array
 */
void netInit_esp8266_board() {
  if (g_socketsInitialized) return;
  g_socketsInitialized = true;
  os_memset(socketArray, 0, sizeof(socketArray));
}

and since g_socketsInitialized is already true - the socket array is not re-initialised.

However, I would have thought the slots should be in the SOCKET_STATE_UNUSED and would be available?

@wilberforce
Copy link
Member

Ahh - worked out how to reproduce....

It's when you get an inbound connection, while the script is being up loaded. So this naturally when you are using web sockets -as there is a regular heartbeat.

So start the ide upload and refresh the browser.... after a few iterations, the socketArray is full:


ERROR: Unable to create socket
179518> === socket 1 type=server, txBuf=0x00000000, state=idle, espconn=0x3fff9158, err=0, rx:
179526> === socket 30 type=inbound, txBuf=0x00000000, state=closing, espconn=0x00000000, err=-5, rx: 416@0x3fffa9e8
179536> === socket 31 type=inbound, txBuf=0x00000000, state=closing, espconn=0x00000000, err=0, rx:
179545> === socket 46 type=inbound, txBuf=0x00000000, state=closing, espconn=0x00000000, err=0, rx: 416@0x3fffad48
179555> === socket 47 type=inbound, txBuf=0x00000000, state=closing, espconn=0x00000000, err=0, rx:
179563> === socket 50 type=inbound, txBuf=0x00000000, state=closing, espconn=0x00000000, err=0, rx: 416@0x3fffab98
179573> === socket 52 type=inbound, txBuf=0x00000000, state=closing, espconn=0x00000000, err=0, rx: 416@0x3fffaef8
179584> === socket 54 type=inbound, txBuf=0x00000000, state=closing, espconn=0x00000000, err=0, rx: 416@0x3fffb0a8
179594> === socket 55 type=inbound, txBuf=0x00000000, state=closing, espconn=0x00000000, err=0, rx: 416@0x3fffb258
179604> === socket 57 type=inbound, txBuf=0x00000000, state=closing, espconn=0x00000000, err=0, rx: 416@0x3fffb408

@tve
Copy link
Contributor Author

tve commented Jun 25, 2016

and since g_socketsInitialized is already true - the socket array is not re-initialised.

IIRC that's because the telnet console is also one of the sockets, so you don't want to blow that away...

@tve
Copy link
Contributor Author

tve commented Jun 25, 2016

@davidmoshal I cannot reproduce your problem. I'm running curl pulling the hello world page in a loop running as fast as it will go (several requests per second) and it's already been running for a few minutes without any issue.

@tve
Copy link
Contributor Author

tve commented Jun 25, 2016

@wilberforce are you saying that if you upload code while a socket is open then the socket does not get closed? That's possible.

@davidmoshal
Copy link

davidmoshal commented Jun 25, 2016

@tve are you uploading code at the same time?

@wilberforce 's problem seemed exactly the same as mine, so I ran his code, and I get the same error he got - socket count increases until they are dropped:

243380> net_esp8266: listening socket 30 on port 8080
243445> === socket 1 type=server, txBuf=0x00000000, state=idle, espconn=0x3fffa448, err=0, rx:
243445> === socket 2 type=server, txBuf=0x00000000, state=disconnecting, espconn=0x3fffa5a0, err=0, rx:
243452> === socket 5 type=server, txBuf=0x00000000, state=disconnecting, espconn=0x3fffa658, err=0, rx:
243461> === socket 10 type=server, txBuf=0x00000000, state=disconnecting, espconn=0x3fffa960, err=0, rx:
243470> === socket 13 type=server, txBuf=0x00000000, state=disconnecting, espconn=0x3fffa9b0, err=0, rx:
243479> === socket 16 type=server, txBuf=0x00000000, state=disconnecting, espconn=0x3fffaa00, err=0, rx:
243488> === socket 21 type=server, txBuf=0x00000000, state=disconnecting, espconn=0x3fffaa50, err=0, rx:
243497> === socket 24 type=server, txBuf=0x00000000, state=disconnecting, espconn=0x3fffae80, err=0, rx:
243507> === socket 27 type=server, txBuf=0x00000000, state=disconnecting, espconn=0x3fffaed0, err=0, rx:
243516> === socket 30 type=server, txBuf=0x00000000, state=idle, espconn=0x3fffaf20, err=0, rx:
243524> net_esp8266: out of sockets, dropping inbound connection
=undefined
> 

Note: in my original case I noticed that even without uploading code, the browser seemed to still be waiting after res.end("Hello World")
Shouldn't the socket close after `res.end()'?

@wilberforce
Copy link
Member

wilberforce commented Jun 25, 2016

@wilberforce are you saying that if you upload code while a socket is open then the socket does not get closed? That's possible.

@tve
It seems so. It is quite hard to reproduce this, I've not managed to get it down to a simple case that is reproducible ;-(

I have been doing the tests, uploading with serial. If you upload over a wifi connection it does not seem to occur...

@davidmoshal
Copy link

@tve I was able to reproduce it easily - note: I had to upload the code and refresh the browser about 10 times.
Separately: where are the instructions for uploading over a wifi connection?

David

@MaBecker
Copy link
Contributor

This is a simple way howto hammer the ESPRUINO webserver using curl

How to quickly stress test a web server

curl "http://<esp_ip>:8080?[1-100]"

Next level is to use ab Apache HTTP server benchmarking tool

using this allows you to quickly reproduce out of sockets and prints some stats

ab -c 10 -n 10  "http://192.168.194.64:8080/"
This is ApacheBench, Version 2.3 <$Revision: 1663405 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.194.64 (be patient).....done


Server Software:        Espruino
Server Hostname:        192.168.194.64
Server Port:            8080

Document Path:          /
Document Length:        86 bytes

Concurrency Level:      10
Time taken for tests:   1.431 seconds
Complete requests:      10
Failed requests:        0
Total transferred:      1480 bytes
HTML transferred:       860 bytes
Requests per second:    6.99 [#/sec] (mean)
Time per request:       1430.944 [ms] (mean)
Time per request:       143.094 [ms] (mean, across all concurrent requests)
Transfer rate:          1.01 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        5    9   2.9      9      14
Processing:   304  862 365.1    945    1422
Waiting:      304  862 364.7    944    1419
Total:        309  871 367.3    956    1431

Percentage of the requests served within a certain time (ms)
  50%    956
  66%   1080
  75%   1203
  80%   1208
  90%   1431
  95%   1431
  98%   1431
  99%   1431
 100%   1431 (longest request)

@MaBecker
Copy link
Contributor

@tve is the espressif SDK the root cause ?

@MrTimcakes
Copy link

Another easy way to reproduce this issue is to use the Ping Module and spam out ping requests, you soon slam into the error.

@MaBecker
Copy link
Contributor

MaBecker commented Sep 4, 2016

running a sensor, that sends every 5min data to thingspeak.com ( chart )

  • connect to WebIDE by /dev/cu.usbserial: getting socket errors from time to time : gaps in graph
  • disconnected from WebIDE: no more socket error : no gaps in graph

Is there a context ?

@MaBecker
Copy link
Contributor

MaBecker commented Sep 4, 2016

new gap because sensor moved to a different location - restarted at 15:54 GMT+0200

@MaBecker
Copy link
Contributor

Hi, can some one please explain this output line marke with <-- ?

84904> E:M 656
784905> E:M 656
784906> E:M 656
784908> E:M 656
784909> E:M 656
784911> E:M 656
784912> E:M 656
784914> net_esp8266: error -1->-2 on socket 805: out of memory   <-- ?
784920> net_esp8266: socket 805 to be closed
784924> net_esp8266: socket 805 disconnected
788705> === socket 1 type=server, txBuf=0x00000000, state=idle, espconn=0x3fff9810, err=0, rx:
788705> === socket 450 type=server, txBuf=0x00000000, state=idle, espconn=0x3fff9f18, err=-5, rx:
788710> === socket 0 type=none, txBuf=0x00000000, state=unused, espconn=0x00000000, err=0, rx:
.....

this is the used command to stress the sockets:

watch -n 5 ab -s 5 -c 5 -n 5 http://192.168.194.50:8080/ 

@tve
Copy link
Contributor Author

tve commented Sep 23, 2016

The problem starts way before the highlighted line. E:M 656 is printed by malloc and means that it failed to allocate 656 bytes, which must be a TCP buffer. By the looks of it the alloc was called by the SDK, e.g. lwip. The line you highlighted is when the espruino networking code gets the error.
So the bottom line is that you're running out of heap...

@msol2004
Copy link

Hello,
is "ERROR: Unable to create socket" solved?
We have Espruino on ESP8266 1v88 and got this error under net.connect()

@MaBecker
Copy link
Contributor

no - not as far as I know

@msol2004
Copy link

We have very periodic problem related to "ERROR: Unable to create socket".
- Can it generate any on.error message onto JS level?
- Could it be fixed in the next release?
- I would like to help with this somehow... Please give advice

@MaBecker
Copy link
Contributor

Hi @msol2004

  • this seems to be a bug in the ESP8266 network implementation or memory issue

- Can it generate any on.error message onto JS level?

  • I would say no, the error happens on lower layer

- Could it be fixed in the next release?

  • Only if the ESP8266 user fix it, unsupported board etc....

- I would like to help with this somehow... Please give advice

  • setup a the toolchain to build your own

  • add as many os_print() you need

  • eg libs/network/esp8266/network_esp8266.c

  • activate login on D2 and connect a RX to TRX1 (D2)

require("ESP8266").setLog(3) 
  • and try to nail this down

@tve and @wilberforce would be nice if you can add your findings and suggestions

@wilberforce
Copy link
Member

@MaBecker

Sorry, I can really add much.

I've found the esp8266 do be a bit limiting when I try to do anything, so I'm focusing efforts on the esp32 port. There are much more resources available and these type of issues are far less likely to occur.

@jumjum123 and I have been working on the esp32 port, there is a esp32 branch and and gitter.im room.

@msol2004
Copy link

I am in dead end with "setup a the toolchain".

I am focused in Eclipse dev.tools and, unfortunately, I still cannot find the right manual "how to compile Espruino project". Any guidance will be very appreciated!

@MaBecker
Copy link
Contributor

check README_Building.md and let me know if you need more informations

@msol2004
Copy link

msol2004 commented Jan 24, 2017

Thank you for your attention!

Before your answer, I have found recommendations for "UNOFFICIAL DEVELOPMENT KIT FOR ESPRESSIF ESP8266" and followed them.

Very likely, I have installed all things properly, because two sample projects (hello_world and lwip_open_demo_app) have been downloaded into ESP-12F and functioned as expected.

Now I am trying to find proper way according to instructions in README_Building.md and README_BuildProcess.md files. I met several questions after the reading README_Building.md under the line "for esp8266":
- where can "STANDALONE=n" be found?
- I connot find "esp_iot_sdk_v1.5.0" file. Does it exist for Windows?
- should I somehow edit lines like "ESP_FLASH_MAX ?= 479232", because "?=" in the middle?

Anyway, I have uncommented the line ESP8266_BOARD=1 and added
FLASH_4MB=1 # if you have an esp-12
ESP8266_SDK_ROOT=c:/Espressif/esp_iot_sdk_v1.5.0 # ????
PATH=$PATH;c:/Espressif/xtensa-lx106-elf/bin/
COMPORT=COM14
include ../settings.mk
MODULE = driver user
LIBS = c gcc hal phy pp net80211 lwip wpa main crypto
include ../common_nonos.mk

Log file of my first attempt to build firmware:

21:36:24 **** Build of configuration Default for project Espruino ****
make all
make: Nothing to be done for `all'.
21:37:05 Build Finished (took 41s.432ms)

Build report:

Description Resource Path Location Type
make: *** No rule to make target `Espruino_GM'. Stop. Espruino C/C++ Problem

Could you guide me onto the next step?

@MaBecker
Copy link
Contributor

please also check Dockerfile-esp8266

My Env setting:

export ESP8266_BOARD=1
export FLASH_4MB=1
export ESP8266_SDK_ROOT=/esptools/esp_iot_sdk/sdk
export PATH=$PATH:/esptools/esp-open-sdk/xtensa-lx106-elf/bin/
export COMPORT=/dev/cu.usbserial

clone and install flash tool esptool

git clone https://github.com/espressif/esptool

clone Espruino

git clone https://github.com/espruino/Espruino.git

cd Espruino

make 

check README_flash.txt to flash your new binaries

Please let me know if you need more support

@msol2004
Copy link

Thank you, it works!

@MaBecker
Copy link
Contributor

@msol2004 any news on this?

@maze1980
Copy link

maze1980 commented Apr 6, 2017

The ESP8266 supports only 15 sockets, that's a hardware limitation, and to make it worse the current build of Espruino supports only 10 sockets. I can't see any indication of any other error in the messages in this thread, it looks like you're simply hitting that limit to 10 sockets.

Just be clarify, maybe I didn't understand the error report correctly: As far as I understand the sockets get closed eventually, and new connections are possible again ('some gaps in the graphs'). Is this correct?

The only thing I don't understand: How could @tve possibly have curl running for minutes without getting any error - was the web server really running on the ESP8266?

@gfwilliams
Copy link
Member

As far as I understand the sockets get closed eventually, and new connections are possible again ('some gaps in the graphs'). Is this correct?

Yes, absolutely! They should get closed pretty quickly.

The only thing I don't understand: How could @tve possibly have curl running for minutes without getting any error

Because sockets get closed after being used. If they're not getting closed you're doing something wrong, and with 50 sockets you'll probably just run out of RAM before anything else.

@maze1980
Copy link

maze1980 commented Apr 7, 2017

The only thing I don't understand: How could @tve possibly have curl running for minutes without getting any error

Because sockets get closed after being used. If they're not getting closed you're doing something wrong, and with 50 sockets you'll probably just run out of RAM before anything else.

Just checked curl: curl does actively close the connection on the client side, therefore the server can re-use the socket without any wait time, and this is most likely why it worked fine. But if the client doesn't actively close the session the server has to do close the session. This can take a while (4 minutes typically in TIMEWAIT state).

@gfwilliams
Copy link
Member

Espruino doesn't support socket re-use, so I believe it closes the socket itself once it's served up the page.

What exact problems are you having with Espruino on ESP8266, and with what code?

@maze1980
Copy link

maze1980 commented Apr 7, 2017

Sorry, I used the wrong terminology. It's re-using one of the 10 available structures for sockets (as defined in MAX_SOCKETS) to create a new socket, not the socket itself.
I don't have any problem with sockets, and I just wondered if there is really a bug. To me everything looks like it works as designed. (The limit to 10 sockets is just a limitation, not a bug).

@gfwilliams
Copy link
Member

I don't have any problem with sockets, and I just wondered if there is really a bug.

Me too :) I think what happened was there was historically a bug where Espruino was 'leaking' open sockets in some cases, and then a bug in the ESP8266 SDK where it did the same when low on memory, but it sounds a lot like this has now been fixed in both cases.

I'll just close this now I think and will see if anyone reopens with with a legitimate report of leaking sockets, and not an 'Unable to create socket' that's just caused by trying to open too many sockets.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug ESP8266 This is only a problem on ESP8266 devices
Projects
None yet
Development

No branches or pull requests

8 participants