Artificial neural networks have been used to support applications across a variety of business and scientific disciplines in recent years.
Artificial neural network applications are frequently viewed as black boxes which mystically determine complex patterns in data. Contrary to
this popular view, neural network designers typically perform extensive knowledge engineering and incorporate a significant amount of
domain knowledge into artificial neural networks. This paper details heuristics that utilize domain knowledge to produce an artificial neural
network with optimal output performance. The effect of using the heuristics on neural network performance is illustrated by examining
several applied artificial neural network systems. Identifying an optimal-performance artificial neural network requires a full
factorial design over the number of input nodes, hidden nodes, and hidden layers and the choice of learning algorithm. The heuristic
methods discussed in this paper produce optimal or near-optimal artificial neural networks in only a fraction of the time
needed for a full factorial design.
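
To give a sense of why a full factorial design is expensive, the sketch below enumerates every combination of the four design factors named in the abstract. The candidate levels and the "heuristic" restriction are invented for illustration and do not come from the paper.

```python
from itertools import product

# Hypothetical candidate levels for each design factor (not from the paper).
input_nodes = [4, 8, 12, 16]
hidden_nodes = [2, 4, 8, 16, 32]
hidden_layers = [1, 2, 3]
learning_algorithms = ["backprop", "rprop", "conjugate_gradient"]

# Full factorial design: one network configuration per combination of levels.
full_design = list(
    product(input_nodes, hidden_nodes, hidden_layers, learning_algorithms)
)

# Assumption for illustration only: domain knowledge fixes the input set at
# 8 nodes and the architecture at a single hidden layer, so only the number
# of hidden nodes and the learning algorithm remain to be searched.
heuristic_design = [cfg for cfg in full_design if cfg[0] == 8 and cfg[2] == 1]

print(len(full_design))       # 4 * 5 * 3 * 3 = 180 networks to train
print(len(heuristic_design))  # 5 * 3 = 15 networks to train
```

Each configuration stands for one network that must be trained and evaluated, so any factor that can be fixed in advance divides the search cost by its number of levels.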
Verner, June M., Evanco, William M. and Cerpa, Narciso (2007): State of the practice: how important is effort estimation to software development success? In Information and Software Technology, 49 (2) pp. 181-193.
During discussions with a group of U.S. software developers we explored the effect of schedule estimation practices and their implications
for software project success. Our objective was not only to explore the direct effects of cost and schedule estimation on the perceived
success or failure of a software development project, but also to quantitatively examine a host of factors surrounding the estimation issue
that may impinge on project outcomes. We later asked our initial group of practitioners to respond to a questionnaire that covered some
important cost and schedule estimation topics. Then, to determine whether the results were generalizable, two other groups, from the U.S.
and Australia, completed the questionnaire. Based on these convenience samples, we conducted exploratory statistical analyses to identify
determinants of project success and used logistic regression to predict project success for the entire sample, as well as for each of the
groups separately. From the developer's point of view, our overall results suggest that success is more likely if the project manager is
involved in schedule negotiations, adequate requirements information is available when the estimates are made, initial effort estimates
are good, the estimates take staff leave into account, and staff are not added late to meet an aggressive schedule. For these organizations we found
that developer input to the estimates did not improve the chances of project success or improve the estimates. We then used the logistic
regression results from each single group to predict project success for the other two remaining groups combined. The results show that
there is a reasonable degree of generalizability among the different groups.
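
The cross-group check described above (fit on one group, predict the other two combined) can be sketched as follows. This is a minimal illustration, not the authors' analysis: the three binary predictors, the group names, the synthetic responses, and the plain gradient-descent fit are all assumptions.

```python
import math

def predict_proba(weights, x):
    """Logistic model: weights[0] is the intercept, the rest pair with x."""
    z = weights[0] + sum(w * xj for w, xj in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, epochs=2000, rate=0.1):
    """Fit logistic regression by plain batch gradient descent."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            error = predict_proba(weights, x) - y
            grad[0] += error
            for j, xj in enumerate(x):
                grad[j + 1] += error * xj
        weights = [w - rate * g / len(rows) for w, g in zip(weights, grad)]
    return weights

def accuracy(weights, rows, labels, cutoff=0.5):
    """Share of projects whose predicted outcome matches the recorded one."""
    hits = sum((predict_proba(weights, x) >= cutoff) == (y == 1)
               for x, y in zip(rows, labels))
    return hits / len(rows)

# Synthetic, made-up responses. Hypothetical predictors per project:
# (pm_in_negotiations, good_requirements, staff_added_late)
clean_rows = [(p, r, s) for p in (0, 1) for r in (0, 1) for s in (0, 1)]
clean_labels = [1 if p + r - s >= 1 else 0 for p, r, s in clean_rows]

def make_group(flipped_index):
    """Copy the clean data and flip one outcome so the groups differ slightly."""
    labels = list(clean_labels)
    labels[flipped_index] = 1 - labels[flipped_index]
    return clean_rows, labels

groups = {
    "us_initial": make_group(0),
    "us_second": make_group(3),
    "australia": make_group(6),
}

# Train on each single group, then test on the other two groups combined.
results = {}
for name, (train_rows, train_labels) in groups.items():
    weights = fit_logistic(train_rows, train_labels)
    test_rows, test_labels = [], []
    for other, (rows, labels) in groups.items():
        if other != name:
            test_rows.extend(rows)
            test_labels.extend(labels)
    results[name] = accuracy(weights, test_rows, test_labels)

for name, acc in results.items():
    print(f"trained on {name}: accuracy on the other two groups = {acc:.2f}")
```

Similar accuracies across the three rows would be one informal sign that a model built on one group transfers to the others.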
Context: Software has been developed since the 1960s, but the success rate of software development projects is still low. During the development of software, the probability of success is affected by various practices or aspects. To date, it is not clear which of these aspects are more important in influencing project outcome.
Objective: In this research, we identify aspects which could influence project success, build prediction models based on the aspects using data collected from multiple companies, and then test their performance on data from a single organization.
Method: A survey-based empirical investigation was used to examine variables and factors that contribute to project outcome. Variables that were highly correlated with project success were selected, and the set of variables was reduced to three factors by using principal components analysis. A logistic regression model was built for each of the set of variables and the set of factors, using heterogeneous data collected from two different countries and a variety of organizations. We tested these models by using a homogeneous hold-out dataset from one organization. We used receiver operating characteristic (ROC) analysis to compare the performance of the variable-based and factor-based models when applied to the homogeneous dataset.
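
As an illustration of the ROC comparison in the Method, the sketch below computes the area under the ROC curve for two sets of predicted probabilities on the same hold-out projects. The outcomes and scores are invented, and the two function names are hypothetical helpers rather than anything taken from the study.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per cut-off."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Sweep the cut-off from the highest score downwards.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Made-up hold-out outcomes (1 = success) and model scores.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
variable_scores = [0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20]
factor_scores = [0.85, 0.60, 0.65, 0.70, 0.50, 0.45, 0.35, 0.10]

auc_variables = auc(roc_points(labels, variable_scores))
auc_factors = auc(roc_points(labels, factor_scores))
print(f"variable-based AUC: {auc_variables:.2f}")
print(f"factor-based AUC:   {auc_factors:.2f}")
```

Both invented score lists give an area of 0.75, above the 0.5 expected from random guessing. A real comparison would also need a significance test on the difference between the two areas.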
Results: We found that using raw variables or factors in the logistic regression models did not make any significant difference in predictive capability. The prediction accuracy of these models is more balanced when the cut-off is set to the ratio of success to failures in the datasets used to build the models. We found that the raw variable and factor-based models predict significantly better than random chance.
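
The effect of the cut-off on balance can be illustrated with a small sketch. It assumes the cut-off is the proportion of successful projects in the model-building data, which is one reading of the statement above (a probability cut-off has to lie between 0 and 1). The outcomes and probabilities are invented.

```python
def sensitivity_specificity(labels, probabilities, cutoff):
    """Return (share of successes caught, share of failures caught) at a cut-off."""
    predicted = [p >= cutoff for p in probabilities]
    successes = sum(labels)
    failures = len(labels) - successes
    true_pos = sum(1 for y, hit in zip(labels, predicted) if y == 1 and hit)
    true_neg = sum(1 for y, hit in zip(labels, predicted) if y == 0 and not hit)
    return true_pos / successes, true_neg / failures

# Made-up outcomes (1 = success) and model probabilities; 7 of 10 succeed.
labels = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
probabilities = [0.95, 0.90, 0.85, 0.80, 0.75, 0.72, 0.65, 0.68, 0.60, 0.40]

# Assumed cut-off: the share of successes in the data used to build the model.
success_share = sum(labels) / len(labels)

for cutoff in (0.5, success_share):
    sens, spec = sensitivity_specificity(labels, probabilities, cutoff)
    print(f"cut-off {cutoff:.2f}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

With these invented numbers, the default 0.5 cut-off catches every success but only one of the three failures, while the 0.7 cut-off catches six of seven successes and all three failures.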
Conclusion: We conclude that an organization wishing to estimate whether a project will succeed or fail may use a model created from heterogeneous data derived from multiple organizations.