Conrad Herrmann
2015-06-14 09:00:35 UTC
Martin,
I have recently run into the same problem.
I think the problem is that SmbFile.resolveDfs() uses the currently
connected transport as the DFS resolver/domain server
(tree.session.transport.tconHostName), but it is very possible that there is
no currently connected transport. In your case, that happens when the file
server forces the TCP connection to close, and the transport tears itself
down.
And, although the resolveDfs() method calls connect0(), in fact this does
nothing because doConnect() doesn't force creation of a new connection if we
are talking about a DFS resolved path.
It seems to me that in that case, we have to start over again at the top of
the referral tree, with the Domain.
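So my solution is this: change the code in SmbFile.resolveDfs() at line 671
(or so) so that it reads:
    connect0();
    String hostName = tree.session.transport.tconHostName;
    String domainDfsServerName = getServerWithDfs();
    if (hostName == null)
        hostName = domainDfsServerName;
    DfsReferral dr = dfs.resolve(
            hostName,
            tree.share,
            unc,
            auth);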
The code comes from the other use of tconHostName, in SmbFile.doConnect():
    String hostName = getServerWithDfs();
    tree.inDomainDfs = dfs.resolve(hostName, tree.share, null, auth) != null;
In this code, we are getting the DFS resolver (which might be the domain
server) as the hostName, and asking it to resolve our share.
Basically, what this new code says is:
- in the case where the transport has closed (i.e., because of a timeout or
a TCP close on the DFS server side), reconnect to the DFS domain server in
order to resolve the share's DFS server.
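
The fallback rule can be sketched on its own, outside jCIFS. This is only an illustrative model: the class DfsResolverHostSketch, the method chooseResolverHost and the example host names are hypothetical and are not part of the jCIFS API. The sketch assumes that a torn-down transport shows up as a null host name.

```java
import java.util.Objects;

// Hypothetical stand-alone model of the fallback; this is NOT jCIFS code.
final class DfsResolverHostSketch {

    /**
     * Picks the host that should receive the DFS referral request.
     *
     * @param transportHostName   host name of the connected transport, or null
     *                            if the transport has been torn down (assumption)
     * @param domainDfsServerName server at the top of the referral tree
     */
    static String chooseResolverHost(String transportHostName, String domainDfsServerName) {
        Objects.requireNonNull(domainDfsServerName, "domainDfsServerName");
        // Prefer the live transport; otherwise start over at the domain server.
        return transportHostName != null ? transportHostName : domainDfsServerName;
    }

    public static void main(String[] args) {
        // Transport still connected: keep asking the current server.
        System.out.println(chooseResolverHost("fs01.example.com", "example.com"));
        // Transport closed: fall back to the domain DFS server.
        System.out.println(chooseResolverHost(null, "example.com"));
    }
}
```

The model covers only the choice of resolver host; reconnecting and issuing the referral request are left to the library.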
I can imagine a case where this doesn't work: if we have multiple levels of
DFS redirection, where the domain server cannot redirect the client to a
deep subdirectory. But I don't even know if this is possible in DFS. If
it is, then at least this solution removes the top-level case and identifies
the problem, which would require walking down the DFS resolution path to
resolve the actual file server.
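
If multi-level redirection does exist, walking down the resolution path could look roughly like the sketch below. It is a toy model, not jCIFS code: the referral table is a plain map from a UNC path prefix to its replacement, and the class name, the method name, the hop limit and all server names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of following DFS referrals level by level; NOT jCIFS code.
final class DfsReferralWalkSketch {

    /**
     * Repeatedly rewrites the path using the longest matching referral prefix
     * until no referral applies. maxHops guards against referral loops.
     */
    static String walk(Map<String, String> referrals, String path, int maxHops) {
        String current = path;
        for (int hop = 0; hop < maxHops; hop++) {
            String bestPrefix = null;
            for (String prefix : referrals.keySet()) {
                // A prefix matches only on a whole path component boundary.
                boolean matches = current.equals(prefix) || current.startsWith(prefix + "\\");
                if (matches && (bestPrefix == null || prefix.length() > bestPrefix.length())) {
                    bestPrefix = prefix;
                }
            }
            if (bestPrefix == null) {
                return current; // no further referral: this is the actual file server path
            }
            current = referrals.get(bestPrefix) + current.substring(bestPrefix.length());
        }
        throw new IllegalStateException("too many DFS referral hops for " + path);
    }

    public static void main(String[] args) {
        Map<String, String> referrals = new LinkedHashMap<>();
        referrals.put("\\\\example.com\\dfs", "\\\\root01\\dfs");          // domain -> DFS root
        referrals.put("\\\\root01\\dfs\\projects", "\\\\fs02\\proj");      // link -> file server
        System.out.println(walk(referrals, "\\\\example.com\\dfs\\projects\\a.txt", 8));
    }
}
```

In a real client each lookup would be a referral request to the server reached in the previous step, rather than a map lookup.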
Conrad Herrmann
Primdaesk, Inc.
Hi,
I encountered a NullPointerException similar to
https://lists.samba.org/archive/jcifs/2012-January/009856.html - at
least the stack traces are similar.
- jcifs 1.3.18
- IBM JDK 7
- AIX 7.1
The NPE occurred in an in-house plugin for the Jenkins build server. In
this system, JCIFS is used to recursively copy files from a Windows
share to an AIX machine.
Re-running a build shortly after it finished triggered the NPE.
After some debugging, it seems to me like the SmbFile's underlying
transport is closed (by timeout), and when SmbFile.resolveDfs is called,
the transport is not reconnected (unlike, for example, later in
SmbFile.resolve, or in SmbSession.getChallenge).
I was able to reproduce the NPE during debugging using the following steps:
- Trigger a build (recursively copying from a CIFS DFS tree)
- Wait until the transport objects disconnect by timeout (tracked by
breakpoint)
- Retrigger the build (recursively copying the same directory structure)
The Jenkins plugin usually runs the second JCIFS copy operation in the
same thread as the first (though that's not guaranteed).
Each run uses a new SmbFile object.
Am I missing something (like some close operation on SmbFile)?
Is this a known error?
Can I do something to fix it?
Best regards,
Martin