[jcifs] NullPointerException in Dfs.resolve

Discussion:

Conrad Herrmann

2015-06-14 09:00:35 UTC

Martin,

I have recently run into the same problem.

I think the problem is that SmbFile.resolveDfs() uses the currently
connected transport as the DFS resolver/domain server
(tree.session.transport.tconHostName), but it is very possible that there is
no currently connected transport. In your case, that happens when the file
server forces the TCP connection to close, and the transport tears itself
down.

And, although the resolveDfs() method calls connect0(), in fact this does
nothing because doConnect() doesn't force creation of a new connection if we
are talking about a DFS resolved path.

It seems to me that in that case, we have to start over again at the top of
the referral tree, with the Domain.

So my solution is this: change the code in SmbFile.resolveDfs() at line 671
(or so) so that it reads:

    connect0();

    String hostName = tree.session.transport.tconHostName;
    String domainDfsServerName = getServerWithDfs();
    if (hostName == null)
        hostName = domainDfsServerName;

    DfsReferral dr = dfs.resolve(
            hostName,
            tree.share,
            unc,
            auth);

The code comes from the other use of tconHostName, in SmbFile.doConnect():

    String hostName = getServerWithDfs();
    tree.inDomainDfs = dfs.resolve(hostName, tree.share, null, auth) != null;

In this code, we are getting the DFS resolver (which might be the domain
server) as the hostName, and asking it to resolve our share.

Basically, the new code says: if the transport has closed (i.e., because of a
timeout or a TCP close on the DFS server side), reconnect to the DFS domain
server in order to resolve the share's DFS server.

I can imagine a case where this doesn't work--if we have multiple levels of
DFS redirection, where the domain server cannot redirect the client to a
deep subdirectory. But, I don't even know if this is possible in DFS. If
it is, then at least this solution removes the top level case and identifies
the problem, which would require walking down the DFS resolution path to
resolve the actual file server.
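
The fallback described above can be illustrated in isolation. The following is only a minimal sketch, not jcifs code: the class DfsHostFallback and the method pickResolverHost are hypothetical names, and the two string arguments stand in for tree.session.transport.tconHostName and the result of getServerWithDfs().

```java
/**
 * Illustration only: hypothetical helper, not part of jcifs.
 * Models the choice of which host to ask for a DFS referral.
 */
final class DfsHostFallback {

    /**
     * Returns the host of the current transport if there is one,
     * otherwise falls back to the domain DFS server.
     *
     * @param tconHostName    stand-in for transport.tconHostName (null once the transport is torn down)
     * @param domainDfsServer stand-in for the value of getServerWithDfs()
     */
    static String pickResolverHost(String tconHostName, String domainDfsServer) {
        return tconHostName != null ? tconHostName : domainDfsServer;
    }

    public static void main(String[] args) {
        // Transport still connected: keep using its host.
        System.out.println(pickResolverHost("fs01.example.com", "dc01.example.com"));
        // Transport closed: start over at the domain DFS server.
        System.out.println(pickResolverHost(null, "dc01.example.com"));
    }
}
```

The point of the sketch is only that dfs.resolve() never receives a null host name; whether the domain server can answer for deeper referral levels is the open question raised above.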

Conrad Herrmann

Primdaesk, Inc.

Hi,
I encountered a NullPointerException similar to
https://lists.samba.org/archive/jcifs/2012-January/009856.html - at
least the stack traces are similar.
- jcifs 1.3.18
- IBM JDK 7
- AIX 7.1
The NPE occurred in an (in-house) plugin for the Jenkins build server. In
this system, JCIFS is used to recursively copy files from a Windows
share to an AIX machine.
Re-running a build shortly after it finished triggered the NPE.
After some debugging, it seems to me like the SmbFile's underlying
transport is closed (by timeout), and when SmbFile.resolveDfs is called,
the transport is not reconnected (unlike, for example, later in
SmbFile.resolve, or in SmbSession.getChallenge).
I was able to reproduce the NPE during debugging using the following steps:
- Trigger a build (recursively copying from a CIFS DFS tree)
- Wait until the transport objects disconnect by timeout (tracked by
breakpoint)
- Retrigger the build (recursively copying the same directory structure)
The Jenkins plugin usually runs the second JCIFS copy operation in the
same thread as the first (though that's not guaranteed).
Each run uses a new SmbFile object.
Am I missing something (like some close operation on SmbFile)?
Is this a known error?
Can I do something to fix it?
Best regards,
Martin

Michael B Allen

2015-06-14 15:37:48 UTC


I have added this post to the TODO list entry, alongside the other
people who have reported the problem, so that it can be considered
when I get around to looking at this.

Note that the 1.3.18b mentioned in the link cited is here:

http://jcifs.samba.org/old/jcifs-1.3.18b.jar

Although I cannot recall what it actually does anymore, it might be
worth a try. We never received feedback about it.

Mike

--
Michael B Allen
Java Active Directory Integration
http://www.ioplex.com/

Martin Kutter

2015-06-15 11:45:22 UTC


Thanks for your reply.
I've tested 1.3.18b - it does not fix this specific error.

Regarding the fix suggested by Conrad: In the meantime, I've also
implemented a fix for the issue - I changed SmbFile.resolveDfs()
(lines 671 and following) to

    DfsReferral dr = null;
    // disconnect is synchronized to transport, too.
    // make sure our transport doesn't get disconnected while
    // we're inside the synchronized block
    synchronized (tree.session.transport) {
        if (tree.session.transport.tconHostName == null) {
            // disconnect properly if connection is lost
            tree.treeDisconnect(false);
        }
        tree.session.transport.connect();
        dr = dfs.resolve(tree.session.transport.tconHostName,
                         tree.share,
                         unc,
                         auth);
    }
    if (dr != null) {

As my knowledge of CIFS is quite limited, I have no idea whether this
is right (or whether it has bad side effects).

The idea behind it is similar (but not identical) to Conrad's fix: if the
transport has disconnected, disconnect properly and reconnect.
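
To make the control flow easier to follow outside of jcifs, here is a small self-contained model of it. Everything in it is an assumption for illustration: ReconnectBeforeResolve, Transport and Tree are hypothetical stand-ins, not the jcifs classes, and the "referral" is just a string built from the host name and the share.

```java
/**
 * Illustration only: hypothetical model classes, not the jcifs implementation.
 * Shows "clean up if the connection was lost, reconnect, then resolve",
 * all while holding the transport's monitor.
 */
final class ReconnectBeforeResolve {

    /** Hypothetical stand-in for a transport. */
    static final class Transport {
        private final String server;
        String tconHostName;   // null while disconnected
        int connects;          // number of real (re)connects, for illustration

        Transport(String server) { this.server = server; }

        synchronized void connect() {
            if (tconHostName == null) {   // no-op when already connected
                tconHostName = server;
                connects++;
            }
        }

        synchronized void disconnect() { tconHostName = null; }
    }

    /** Hypothetical stand-in for a tree connection bound to a transport. */
    static final class Tree {
        final Transport transport;
        final String share;
        boolean connected = true;

        Tree(Transport transport, String share) {
            this.transport = transport;
            this.share = share;
        }

        void treeDisconnect() { connected = false; }
    }

    /** Returns a fake referral string; never sees a null host name. */
    static String resolve(Tree tree) {
        // Hold the transport's monitor so a concurrent disconnect()
        // cannot null out tconHostName between connect() and its use.
        synchronized (tree.transport) {
            if (tree.transport.tconHostName == null) {
                tree.treeDisconnect();    // drop stale tree state first
            }
            tree.transport.connect();     // reentrant: monitor already held
            return "smb://" + tree.transport.tconHostName + "/" + tree.share + "/";
        }
    }

    public static void main(String[] args) {
        Transport t = new Transport("fs01.example.com");
        t.connect();
        Tree tree = new Tree(t, "data");
        t.disconnect();                   // simulate the timeout
        System.out.println(resolve(tree));
    }
}
```

The model only works because disconnect() takes the same monitor as the block in resolve(); it says nothing about whether treeDisconnect(false) in jcifs has other side effects.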

Best regards,

Martin
