We have a dozen or more CR1000 loggers, a few dozen CR10X’s and even a few CR21X’s that are still chugging along. Well done!
Anyway the CR1000’s are connected via an IP network built with FreeWave 900MHz radios.
The problem is that we are experiencing some ‘lockup’ issues that we think are related to the CR1000/NL115 combination and we haven’t been able to pinpoint exactly what is going on. (BTW there are only CR1000 loggers on the IP network.)
Two remote sites both with CR1000’s/NL115 NICs attached via FreeWave radios have independently lost connections to LoggerNet after 6+ months of operation.
(The rest of the IP network functions perfectly, the radio that is connected to the “locked-up” NL115 can still be interrogated and even still passes data further down the chain, you can even plug your laptop into the switch and communicate in and out of the site.)
But you cannot ping the logger or communicate with it in any way via IP.
Upon arrival both the Radio and the NL115 have link lights on with activity lights occasionally blinking, which seems weird to us as you cannot connect to the Logger.
The good news (I guess) is that upon arrival at the site we are able to connect directly to the CR1000 with LoggerNet via both the RS232 and the CS I/O and the logger seems fine, measurements are still being made, all data is present and correct.
Device Configuration Utility connected to the logger even still shows the correct static IP address.
Using a network cable to connect directly from the CR1000/NL115 to a laptop, a 10 Mb/s link is established; however, the logger can’t be pinged.
Removing the Ethernet cable from the NL115 and plugging it into another CR1000/NL115 combination immediately allows connection to the new setup so there seems to be absolutely nothing wrong on the radio/cable side.
Removing the NL115 and replacing it doesn’t help either.
One thing we have just thought of that we haven’t tried is restarting the logger program (via serial) to see if this frees up the IP connection.
The only solution that has worked so far to restore service is to remove and reapply the power to the logger.
As you can imagine this is a disastrous situation for us and our client as both sites are helicopter access only and the data loss and expense cannot be tolerated.
All CR1000’s are running CR1000.Std.17 firmware, however we are gradually updating to 18.
LoggerNet connects to all loggers at 1 and 10 minute intervals & collects data – no call-back or other IP commands are used.
Any thoughts would be gratefully received.
Regards
Stewart
* Last updated by: MilfordRoadNewZealand on 3/3/2010 @ 10:33 AM *
New reply follows:
Do you see the same issue with OS17 and 18?
Can you tell us if you use ftpclient commands in your program (I think you say not) OR TCPClose?
If you restart the program in the logger I am sure it will clear the issue you have, as that will reset the TCP/IP stack and restart it. However, that is no long-term solution, as you clearly can't do that remotely if you cannot connect to it.
Sorry you are having this problem.
New reply follows:
Well, it seems it is the same problem that occurs at our sites. I was already in contact with Janet (http://www.campbellsci.com/forum/messages.cfm?threadid=E1704294-D262-1F8D-D74C4C792F358DC8)
We are using the NL120 & CR1000. The behaviour is almost the same as you described it. The device that is between the LoggerNet server and the NL120 is working, but it is not possible to access the logger via this connection. If you are at the site you can connect to the datalogger using RS232, Com3 or CS I/O. The only possibility is to apply a hardware reset to the logger.
Our program uses FTPClient and TCPOpen. I am trying to work around the problem by avoiding running these tasks concurrently.
Johannes
New reply follows:
Can I apologise again for the issues you are seeing. There have been a few issues with FTPClient, TCPOpen and the TableFile command, which could cause problems with TCP/IP traffic, that have been fixed in recent operating systems.
Very soon, another update to the operating systems (19 for the CR1000) will be published on the web site that should, hopefully, nail the last of these issues.
I am a little concerned with the report from NZ, though, as there could still be some other issue if that application does not use FTPClient, TableFile or similar. If you have a logger in a locked-up state and are visiting it to get it back online, we'd need to discuss some further diagnostic steps with you to check this out before you reset the logger.
Please contact Janet or myself in this instance.
New reply follows:
Hi folks,
I can now confirm that restarting the logger program via serial does indeed "free-up" the IP again thus allowing communication to the logger via IP without removing & re-applying the power. This is not at all perfect and not a solution for us at all but is another piece of useful information.
The programs in the "locked-up" loggers don't use FTPClient or TCPOpen. The closest thing is another logger that does use "PingIP" and does lock up occasionally; however, this logger seems to clear itself. It's running OS15 or 16 I think; I will confirm later. Sometimes when this CR1000 locks up, LoggerNet3 is implicated: it has also locked up and will not close without shutting down the process. This is on a different network to the others. “This is another whole new story, which I won’t go into just yet”
We’ve only just started deploying OS18 in the last month and it’s only in a couple of loggers so far so I couldn’t categorically say that it has helped yet.
Please send any "further diagnostic steps" that you would like us to take if this happens again.
Do you know when OS19/CR1000 will be available?
Cheers
Stewart
New reply follows:
Version 19 has been released into production. It should be only a matter of days till it hits the website.
If you cannot wait please email me and I can also give you the further diagnostic steps. (andrew dot sandford at campbellsci dot co dot uk)
Getting the file from the website may be worth waiting for as a full update of all your loggernet files will be installed that way.
New reply follows:
I may be having a similar issue with a CR1000/NL115 that's acting as a ModBus traffic manager for an energy monitoring application. It runs fine for a month or two then ceases to acquire the data any longer from the ModbusMaster side of the installation. The program opens 5 sockets with TCPOpen to talk to the meters and the CR1000 itself is a ModBusSlave over the network. It's just recently locked up again with OS17. I just updated the logger to OS18 about 2 weeks ago and so far it appears to be working. There's never an issue communicating with the Logger remotely just with the acquisition of data over the TCP.
Regards,
IslandMan
New reply follows:
This is a few years down the track to open such an old topic.
I am seeing the same thing on OS 32.02. It seems like the lockup occurs after a substantial amount of data is transferred. The only way I have been able to re-establish an IP connection is via a modem power-cycle (unfortunately this is an almost 1700 km drive away).
Was a programmatic solution found for this?
I have tried using EthernetPower() to toggle the power to the module, and forcing a re-compile of the program, but neither solution appears to be working.
I currently have a radio link available to the site (via a pretty tenuous connection from a similar site); however, we require the IP link for data retrieval and TCP callbacks.
Regards,
Ash
New reply follows:
Ash- Although a programmatic solution is preferred, Pordis has a plug-in accessory to periodically reset the modem (or CR1000, if necessary), which has been used on many cell and satellite modems to recover from this and related issues. Typically a 2-second power cycle every 24 hours does the trick, but any off time or on duration is possible. Just to let you know an alternative solution is available if you are unable to find a software solution. This is their model 160A.
New reply follows:
AHann, how is your data being transferred? Is it by PakBus and LoggerNet, HTTP(S), or FTP?
New reply follows:
Programmatic reset can be done with:
FileManage(Status.ProgName,6)
That will just restart the current program. In most cases, that would be sufficient. It will basically be like cycling power of just the datalogger.
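For sites where nobody can connect to trigger that restart, one possible arrangement is a watchdog inside the program itself. The following is an untested sketch, not a confirmed fix. It assumes the program keeps running while the IP stack is locked up (as reported earlier in this thread) and that PingIP returns 0 when no reply arrives. GATEWAY_IP, PING_TIMEOUT, MAX_FAILS and the variable names are hypothetical placeholders; check the PingIP timeout units in the CRBasic help before choosing a value.

```crbasic
'Untested watchdog sketch. All names and values below are assumptions.
Const GATEWAY_IP = "192.168.1.1" 'hypothetical: an address that should always answer
Const PING_TIMEOUT = 500         'assumed value; confirm the units in the CRBasic help
Const MAX_FAILS = 3              'consecutive failed pings before restarting

Public PingResp                  'PingIP result; 0 is taken to mean "no reply"
Public FailCount As Long

BeginProg
  Scan(1,Min,0,0)
    'normal measurements and CallTable instructions go here
  NextScan

  'Run the check in a slow sequence so a slow ping does not delay measurements
  SlowSequence
  Scan(5,Min,0,0)
    PingResp = PingIP(GATEWAY_IP,PING_TIMEOUT)
    If PingResp = 0 Then
      FailCount = FailCount + 1
    Else
      FailCount = 0
    EndIf
    If FailCount >= MAX_FAILS Then
      'Same call as above: restarts the currently running program
      FileManage(Status.ProgName,6)
    EndIf
  NextScan
EndProg
```

Requiring several consecutive failures avoids restarting on a single dropped ping over a marginal radio link. Whether a program restart actually clears a given lockup may depend on the OS version and the cause, so this is best treated as a fallback rather than a guaranteed recovery.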
New reply follows:
@GaryTRoberts, we are using a combination of transfer protocols. FTP for image transfer, TCP callback for data, Pakbus (over TCP) for comms between sites and NTP to keep other IP devices in time sync with the logger.
The lockup occurred just after the logger attempted to clear a backlog of image files, probably a couple of MB worth.
It looks like this has since reset itself, and the site has come back online in time for its midnight check-in.
It seems like something is periodically released to clear and re-enable the IP stack, but I can't find a suitable programmatic call to do this on demand. We will get in a bit of hot water if it locks up during an event and we miss data until midnight.
@JDavis, we tried restarting the program via Restart() and ConstTable.ApplyAndRestart, and sending a new program; none of these solutions immediately released the lock.
@RyanSmith Thanks, we have investigated timed relays in the past, but given how reliable CSI hardware usually is we were hoping to avoid this.