On the latter point, the test does not specify how to interpret the result when a computer fails to imitate a man imitating a woman.
The Colby argument
"As an experimental design for a validation procedure there are a number of weaknesses in the game Turing proposed. The dimension of 'womanlikeness' is too vague a conceptual dimension to make a judgment about using purely verbal information. There are no known criteria for identifying women over teletypes. An ability to deceive on the part of the man is required and the ordinary man may have no skill at this. (Why not use professional female impersonators? But are they really men?) Finally, since the variable is dichotomous, if a computer fails to imitate a man imitating a woman, then is it a successful imitation of a man? From these considerations, we believe that the simple Imitation Game is a weak test"
Kenneth Colby, F. Hilf, S. Weber, and H. Kraemer (1972, p. 202), "Turing-Like Indistinguishability Tests for Validation of a Computer Simulation of Paranoid Processes," Artificial Intelligence 3, pp. 199-221.