This repository was archived by the owner on Jan 11, 2024. It is now read-only.
ArxivEngineering.txt
If there are any errors, please abort, run `arxiv_required` to install the required packages, and start again.
Please wait while we parse the requested information from the global arXiv [arxiv.org] servers
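Each record below follows the same layout: a title line ending in "(author - date)", a summary, a "Link:" line, and a row of "=" characters. The following is a minimal sketch of how such a listing could be read back into structured records. The layout is inferred from the entries themselves, not from the tool's source, and the names `HEADER_RE`, `parse_listing`, and `SAMPLE` are hypothetical.

```python
import re
from typing import Dict, List

# Assumed header layout (inferred from the listing, not from the tool's source):
#   Title (Author - D Month, YYYY)
HEADER_RE = re.compile(
    r"^(?P<title>.+) \((?P<author>[^()]+) - (?P<date>\d{1,2} \w+, \d{4})\)$"
)


def parse_listing(text: str) -> List[Dict[str, str]]:
    """Split the listing on '=' separator rows and parse each record."""
    records = []
    for chunk in re.split(r"^=+\s*$", text, flags=re.MULTILINE):
        lines = [ln.strip() for ln in chunk.splitlines() if ln.strip()]
        # Skip any preamble lines that precede the first header in a chunk.
        start = next((i for i, ln in enumerate(lines) if HEADER_RE.match(ln)), None)
        if start is None or not lines[-1].startswith("Link: "):
            continue
        header = HEADER_RE.match(lines[start])
        records.append({
            "title": header["title"],
            "author": header["author"],
            "date": header["date"],
            "summary": " ".join(lines[start + 1:-1]),
            "link": lines[-1][len("Link: "):],
        })
    return records


# Two records copied from this listing (summaries shortened).
SAMPLE = """\
Memetic Viability Evolution for Constrained Optimization (A. Maesani - 5 October, 2018)
The proposed algorithm can outperform several state-of-the-art methods.
Link: https://arxiv.org/abs/1810.02702
====================================================
Proceedings of the 5th International Workshop on Software Engineering Methods in Spreadsheets (SEMS'18) (Birgit Hofer - 28 August, 2018)
Proceedings of the workshop, held on October 1st, 2018, in Lisbon, Portugal.
Link: https://arxiv.org/abs/1808.09174
====================================================
"""

records = parse_listing(SAMPLE)
for rec in records:
    print(rec["date"], "|", rec["author"], "|", rec["link"])
```

The header pattern anchors on the last parenthesised group, so titles that themselves contain parentheses, such as "(SEMS'18)", are kept whole.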
Secure Deep Learning Engineering: A Software Quality Assurance Perspective (Lei Ma - 10 October, 2018)
This brings up many open safety and security issues with enormous and urgent demands on rigorous methodologies and engineering practice for quality enhancement. In this paper, we perform a large-scale study and construct a paper repository of 223 works relevant to the quality assurance, security, and interpretation of deep learning
Link: https://arxiv.org/abs/1810.04538
====================================================
On Tracking the Physicality of Wi-Fi: A Subspace Approach (Mohammed Alloulah - 9 October, 2018)
Our results demonstrate that the proposed channel statistics alone can robustly reproduce state-of-the-art application-specific feature engineering baselines across multiple usage scenarios
Link: https://arxiv.org/abs/1810.04302
====================================================
Memetic Viability Evolution for Constrained Optimization (A. Maesani - 5 October, 2018)
The proposed algorithm can outperform several state-of-the-art methods on a diverse set of benchmark and engineering problems, both for quality of solutions and computational resources needed.
Link: https://arxiv.org/abs/1810.02702
====================================================
Towards Better Summarizing Bug Reports with Crowdsourcing Elicited Attributes (He Jiang - 28 September, 2018)
Then, we propose a new method named Crowd-Attribute to infer new effective attributes from the crowd-generated data in crowdsourcing and develop a new tool named Crowdsourcing Software Engineering Platform to facilitate this method. With Crowd-Attribute, we successfully construct 11 new attributes and propose a new supervised algorithm named Logistic Regression with Crowdsourced Attributes (LRCA). Experiments over both the public data set SDS with 36 manually annotated bug reports and new large-scale data sets demonstrate that LRCA can consistently outperform the state-of-the-art algorithms for bug report summarization.
Link: https://arxiv.org/abs/1810.00125
====================================================
Reuse and Adaptation for Entity Resolution through Transfer Learning (Saravanan Thirumuruganathan - 28 September, 2018)
Considerable human effort goes into feature engineering and training data creation. We have performed comprehensive experiments on 12 datasets from 5 different domains (publications, movies, songs, restaurants, and books)
Link: https://arxiv.org/abs/1809.11084
====================================================
SAIL: Machine Learning Guided Structural Analysis Attack on Hardware Obfuscation (Prabuddha Chakraborty - 27 September, 2018)
Obfuscation is a technique for protecting hardware intellectual property (IP) blocks against reverse engineering, piracy, and malicious modifications. Evaluation on benchmark circuits shows that we can recover an average of around 84% (up to 95%) of the transformations introduced by obfuscation
Link: https://arxiv.org/abs/1809.10743
====================================================
Morphed Learning: Towards Privacy-Preserving for Deep Learning Based Applications (Juncheng Shen - 20 September, 2018)
Morphed Learning has these three features: (1) Strong protection against reverse-engineering on the morphed data; (2) Acceptable computational and data transmission overhead with no correlation to the depth of the neural network; (3) No degradation of the neural network performance. Theoretical analyses on CIFAR-10 dataset and VGG-16 network show that our method is capable of providing 10^89 morphing possibilities with only 5% computational overhead and 10% transmission overhead under limited knowledge attack scenario
Link: https://arxiv.org/abs/1809.09968
====================================================
The Essence Theory of Software Engineering - Large-Scale Classroom Experiences from 450+ Software Engineering BSc Students (Kai-Kristian Kemell - 24 September, 2018)
To this end, we observe 102 student teams utilize Essence in practical software engineering projects during a semester long, project-based course.
Link: https://arxiv.org/abs/1809.08827
====================================================
OpenMPL: An Open Source Layout Decomposer (Qi Sun - 20 September, 2018)
However, due to the complicated design flow, heavy engineering effort for integration and tuning is required to reproduce them, raising the bar for further advancing the field. This paper presents OpenMPL [1], an open-source multiple patterning layout decomposition framework, with efficient implementations of various state-of-the-art algorithms
Link: https://arxiv.org/abs/1809.07554
====================================================
Fast embedding of multilayer networks: An algorithm and application to group fMRI (James D. Wilson - 17 September, 2018)
Motivated by this problem, we introduce the multi-node2vec algorithm, an efficient and scalable feature engineering method that automatically learns continuous node feature representations from multilayer networks. We demonstrate the efficacy of multi-node2vec on a multilayer functional brain network from resting state fMRI scans over a group of 74 healthy individuals
Link: https://arxiv.org/abs/1809.06437
====================================================
Autonomous drone race: A computationally efficient vision-based navigation and control strategy (S. Li - 16 September, 2018)
Finally, the whole system is tested in a complex environment (a showroom in the faculty of Aerospace Engineering, TU Delft). The result shows that the drone can complete the track of 15 gates at a speed of 1.5 m/s, which is faster than the speeds exhibited at the 2016 and 2017 IROS autonomous drone races.
Link: https://arxiv.org/abs/1809.05958
====================================================
An investigation of a deep learning based malware detection system (Mohit Sewak - 16 September, 2018)
But these results were obtained using extensive man-made custom domain features and the corresponding feature engineering and design efforts. Our proposed approach, besides improving on the previous best results (99.21% accuracy and a false positive rate of 0.19%), indicates that deep learning based systems could deliver an effective defense against malware
Link: https://arxiv.org/abs/1809.05888
====================================================
Spatial Configuration of Agile Wireless Networks with Drone-BSs and User-in-the-loop (Irem Bor-Yaliniz - 14 September, 2018)
Agile networking can reduce over-engineering, costs, and energy waste. Numerical results show that semi-joint SNC is two orders of magnitude faster than joint SNC, and more than 15 percent profit can be obtained compared to conventional systems.
Link: https://arxiv.org/abs/1809.05315
====================================================
Real-time force control of an SEA-based body weight support unit with the 2-DOF control structure (Yubo Sun - 11 September, 2018)
Along with the dramatic progress of rehabilitation science and engineering, BWS is quickly evolving with new initiatives and has attracted deep research effort in recent years. Further, the 2 degrees of freedom (2-DOF) control approach was taken for accurate and robust BWS force control
Link: https://arxiv.org/abs/1809.03826
====================================================
Flow Length and Size Distributions in Campus Internet Traffic (Piotr Jurkiewicz - 11 September, 2018)
For example, in the case of traffic engineering mechanisms that are based on the distinction between elephant and mice flows, it is extremely important to ensure realistic distributions of flow length (in packets) and size (in bytes). The statistics were calculated from real traffic traces comprising 4 billion flows, collected at the Internet-facing interface of a campus network
Link: https://arxiv.org/abs/1809.03486
====================================================
Analysis of the generalization error: Empirical risk minimization over deep artificial neural networks overcomes the curse of dimensionality in the numerical approximation of Black-Scholes partial differential equations (Julius Berner - 9 September, 2018)
Kolmogorov PDEs have been widely used in models from engineering, finance, and the natural sciences. In particular we show that for Kolmogorov PDEs with affine drift and diffusion coefficients and a given accuracy $\varepsilon>0$, ERM over deep neural network hypothesis classes of size scaling polynomially in the dimension $d$ and $\varepsilon^{-1}$ and with a number of training samples scaling polynomially in the dimension $d$ and $\varepsilon^{-1}$ approximates the solution of the Kolmogorov PDE to within accuracy $\varepsilon$ with high probability
Link: https://arxiv.org/abs/1809.03062
====================================================
Coupled IGMM-GANs for deep multimodal anomaly detection in human mobility data (Kathryn Gray - 7 September, 2018)
In this paper we address two challenges that arise in the study of anomalous human trajectories: 1) a lack of ground truth data on what defines an anomaly and 2) the dependence of existing methods on significant pre-processing and feature engineering
Link: https://arxiv.org/abs/1809.02728
====================================================
Gender differences in research areas and topics: An analysis of publications in 285 fields (Mike Thelwall - 4 September, 2018)
Prior research suggests that the imbalances between science, technology, engineering and mathematics fields may be partly due to greater male interest in things and greater female interest in people, or to off-putting masculine cultures in some disciplines. To seek more detailed insights across all subjects, this article compares practising US male and female researchers between and within 285 narrow Scopus fields inside 26 broad fields from their first-authored articles published in 2017
Link: https://arxiv.org/abs/1809.01255
====================================================
Software Professionals' Attitudes towards Video as a Medium in Requirements Engineering (Oliver Karras - 4 September, 2018)
In requirements engineering (RE), knowledge is mainly communicated via written specifications. 64 out of 106 software professionals completed the survey. 59 of them stated that video has the potential to improve RE. However, 34 respondents also mentioned threats of videos for RE
Link: https://arxiv.org/abs/1809.00804
====================================================
Machine learning for predicting thermal power consumption of the Mars Express Spacecraft (Matej Petković - 3 September, 2018)
In particular, we employ state-of-the-art feature engineering approaches for transforming raw telemetry data, in turn used for constructing accurate models with different state-of-the-art machine learning methods
Link: https://arxiv.org/abs/1809.00542
====================================================
Neural Ranking Models for Temporal Dependency Structure Parsing (Yuchen Zhang - 2 September, 2018)
It utilizes a neural ranking model with minimal feature engineering, and parses time expressions and events in a text into a temporal dependency tree structure. In a parsing-only evaluation setup where gold time expressions and events are provided, our parser reaches 0.81 and 0.70 f-score on unlabeled and labeled parsing respectively, a result that is very competitive against alternative approaches
Link: https://arxiv.org/abs/1809.00370
====================================================
Use of Source Code Similarity Metrics in Software Defect Prediction (Ahmet Okutan - 29 August, 2018)
In recent years, defect prediction has received a great deal of attention in the empirical software engineering world. Our experiments on 10 open source data sets show that depending on the amount of detected similarity, proposed metrics could achieve significantly better performance compared to the existing static code metrics in terms of the area under the curve (AUC).
Link: https://arxiv.org/abs/1808.10033
====================================================
Proceedings of the 5th International Workshop on Software Engineering Methods in Spreadsheets (SEMS'18) (Birgit Hofer - 28 August, 2018)
Proceedings of the 5th International Workshop on Software Engineering Methods in Spreadsheets (SEMS'18), held on October 1st, 2018, in Lisbon, Portugal, and co-located with the 2018 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC).
Link: https://arxiv.org/abs/1808.09174
====================================================
Disfluency Detection using Auto-Correlational Neural Networks (Paria Jamshid Lou - 27 August, 2018)
In recent years, the natural language processing community has moved away from task-specific feature engineering, i.e., researchers discovering ad-hoc feature representations for various tasks, in favor of general-purpose methods that learn the input representation by themselves. In experiments, the ACNN model outperforms the baseline CNN on a disfluency detection task with a 5% increase in f-score, which is close to the previous best result on this task.
Link: https://arxiv.org/abs/1808.09092
====================================================
A strong baseline for question relevancy ranking (Ana V. González-Garduño - 27 August, 2018)
The best systems at the SemEval-16 and SemEval-17 community question answering shared tasks -- a task that amounts to question relevancy ranking -- involve complex pipelines and manual feature engineering. We present a strong baseline for question relevancy ranking by training a simple multi-task feed forward network on a bag of 14 distance measures for the input question pair
Link: https://arxiv.org/abs/1808.08836
====================================================
Ensemble Learning Applied to Classify GPS Trajectories of Birds into Male or Female (Dewan Fayzur - 26 August, 2018)
We used an ensemble of several variants of Gradient Boosting Classifier along with Gaussian Process Classifier and Support Vector Classifier after extensive feature engineering, and we ranked first out of 74 registered teams
Link: https://arxiv.org/abs/1808.08613
====================================================
BinMatch: A Semantics-based Hybrid Approach on Binary Code Clone Analysis (Yikun Hu - 19 August, 2018)
Binary code clone analysis is an important technique which has a wide range of applications in software engineering (e.g., plagiarism detection, bug detection). We evaluate BinMatch with eight real-world projects compiled with different compilation configurations and commonly used obfuscation methods, performing over 100 million pairwise function comparisons in total
Link: https://arxiv.org/abs/1808.06216
====================================================
Evaluation of team dynamic in Norwegian projects for IT students (Salah Uddin Ahmed - 14 August, 2018)
We performed a large-scale study of student performance in Software Engineering projects in Norwegian universities. Data was collected from student projects over 4 years at two universities
Link: https://arxiv.org/abs/1808.05473
====================================================
Software Professionals are Not Directors: What Constitutes a Good Video? (Oliver Karras - 15 August, 2018)
Despite 35 years of research on integrating videos in requirements engineering (RE), videos are not an established documentation option in terms of RE best practices
Link: https://arxiv.org/abs/1808.04986
====================================================
Deep EHR: Chronic Disease Prediction Using Medical Notes (Jingshu Liu - 14 August, 2018)
We compare performance of different deep learning architectures including CNN, LSTM and hierarchical models. In contrast to traditional text-based prediction models, our approach does not require disease specific feature engineering, and can handle negations and numerical values that exist in the text. Our results on a cohort of about 1 million patients show that models using text outperform models using just structured data, and that models capable of using numerical values and negations in the text, in addition to the raw text, further improve performance
Link: https://arxiv.org/abs/1808.04928
====================================================
Multimodal Deep Neural Networks using Both Engineered and Learned Representations for Biodegradability Prediction (Garrett B. Goh - 13 September, 2018)
In this work, we develop a novel multimodal CNN-MLP neural network architecture that utilizes both domain-specific feature engineering and learned representations from raw data. DeepBioD, a multimodal CNN-MLP network, is more accurate than either standalone network design, and achieves an error classification rate of 0.125 that is 27% lower than the current state-of-the-art
Link: https://arxiv.org/abs/1808.04456
====================================================
Large-Scale Study of Curiosity-Driven Learning (Yuri Burda - 13 August, 2018)
Reinforcement learning algorithms rely on carefully engineering environment rewards that are extrinsic to the agent. without any extrinsic rewards, across 54 standard benchmark environments, including the Atari game suite
Link: https://arxiv.org/abs/1808.04355
====================================================
A Constrained Shortest Path Scheme for Virtual Network Service Management (Dmitrii Chemodanov - 9 August, 2018)
Successful virtual network service composition and maintenance require flexible and scalable 'constrained shortest path management' both in the management plane for virtual network embedding (VNE) or network function virtualization service chaining (NFV-SC), and in the data plane for traffic engineering (TE). In this paper, we show analytically and empirically that leveraging constrained shortest paths within recent VNE, NFV-SC and TE algorithms can lead to network utilization gains (of up to 50%) and higher energy efficiency
Link: https://arxiv.org/abs/1808.03031
====================================================
Learning to Focus when Ranking Answers (Dana Sagi - 8 August, 2018)
To achieve this representation, the conventional state-of-the-art approaches perform extensive feature engineering that encodes the similarity of the query-answer pair
Link: https://arxiv.org/abs/1808.02724
====================================================
SWDE : A Sub-Word And Document Embedding Based Engine for Clickbait Detection (Vaibhav Kumar - 2 August, 2018)
Initial methods for this task were dependent on feature engineering, which varies with each dataset. We test our model over 2538 posts (having trained it on 17000 records) and achieve an accuracy of 83.49% outscoring previous state-of-the-art approaches.
Link: https://arxiv.org/abs/1808.00957
====================================================
Leveraging Knowledge Graph Embedding Techniques for Industry 4.0 Use Cases (Martina Garofalo - 31 July, 2018)
However, machine learning directly on graphs needs feature engineering and has scalability issues. In this paper we discuss methods to convert (embed) the graph in a vector space, such that it becomes feasible to use traditional machine learning methods for Industry 4.0 settings.
Link: https://arxiv.org/abs/1808.00434
====================================================
Is One Hyperparameter Optimizer Enough? (Huy Tu - 2 October, 2018)
While widely applied in empirical Software Engineering, there has not been much discussion on which hyperparameter tuner is best for software analytics. Surprisingly, no hyperparameter optimizer was observed to be 'best' and, for one of the two evaluation measures studied here (F-measure), hyperparameter optimization was, in 50% of cases, no better than using default configurations.
Link: https://arxiv.org/abs/1807.11112
====================================================
Predicting purchasing intent: Automatic Feature Learning using Recurrent Neural Networks (Humphrey Sheil - 21 July, 2018)
Our main contribution is to address the significant investment in feature engineering that is usually associated with state-of-the-art methods such as Gradient Boosted Machines. Results on benchmark datasets deliver classification accuracy within 98% of state-of-the-art on one and exceed state-of-the-art on the second without the need for any domain / dataset-specific feature engineering on both short and long event sequences.
Link: https://arxiv.org/abs/1807.08207
====================================================
Loud and Interactive Paper Prototyping in Requirements Elicitation: What is it Good for? (Zahra Shakeri Hossein Abad - 19 July, 2018)
Requirements Engineering is a multidisciplinary and human-centered process; therefore, the artifacts produced from RE are always error-prone. To this end, we conducted a case study of (1) 31 mobile application (App) development teams who applied either interactive or loud prototyping and (2) 19 mobile App development teams who applied only face-to-face meetings
Link: https://arxiv.org/abs/1807.07662
====================================================
Real-Time Stereo Vision for Road Surface 3-D Reconstruction (Rui Fan - 28 August, 2018)
Stereo vision techniques have been widely used in civil engineering to acquire 3-D road data. The proposed algorithm is implemented on an NVIDIA GTX 1080 GPU for real-time operation. The experimental results illustrate that the reconstruction accuracy is around 3 mm.
Link: https://arxiv.org/abs/1807.07433
====================================================
Clinical Text Classification with Rule-based Features and Knowledge-guided Convolutional Neural Networks (Liang Yao - 20 July, 2018)
Existing studies have conventionally focused on rules or knowledge sources-based feature engineering, but only a few have exploited effective feature learning capability of deep learning methods. We evaluated our method on the 2008 Integrating Informatics with Biology and the Bedside (i2b2) obesity challenge
Link: https://arxiv.org/abs/1807.07425
====================================================
Development of SageMath filter for Moodle (Yevhenii O. Modlo - 14 July, 2018)
Research objectives: to prove the feasibility of using the Moodle system as a tool to support the process of competency formation in technical objects simulation of future bachelors in electromechanical engineering; to analyze existing support tools for technical objects simulation and to identify the ways of their integration into Moodle; to describe the structure and features of the software implementation of the new SageMath filter for Moodle; to provide guidance on installing and configuring the developed filter; to describe examples of filter usage. The main conclusions and recommendations: 1. 2. 3
Link: https://arxiv.org/abs/1807.06924
====================================================
$h_{PI}$: The Citation Index for Principal Investigators (Christoph Steinbrüchel - 17 July, 2018)
Data are presented for a sample of 48 PIs who are senior faculty members of physics and physics-related engineering departments at a private research-oriented U.S. Also, to a good approximation across the sample of 48 PIs, one finds that $h_{PI} = h / \sqrt{\langle N_{PI} \rangle}$, where $\langle N_{PI} \rangle$ is the average number of principal investigators on the papers of a particular PI. In addition, $h_{PI} = \frac{1}{2} \sqrt{C_{tot} / \langle N_{PI} \rangle}$, where $C_{tot}$ is the total number of citations
Link: https://arxiv.org/abs/1807.06442
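The two approximate relations quoted in the entry above can be checked with a few lines of arithmetic. This is a minimal sketch only: the function names are hypothetical, and the input numbers are made up for illustration and are not taken from the paper.

```python
import math


def h_pi_from_h(h: float, mean_n_pi: float) -> float:
    """Approximation quoted above: h_PI = h / sqrt(<N_PI>)."""
    if mean_n_pi <= 0:
        raise ValueError("mean_n_pi must be positive")
    return h / math.sqrt(mean_n_pi)


def h_pi_from_citations(c_tot: float, mean_n_pi: float) -> float:
    """Approximation quoted above: h_PI = (1/2) * sqrt(C_tot / <N_PI>)."""
    if mean_n_pi <= 0:
        raise ValueError("mean_n_pi must be positive")
    return 0.5 * math.sqrt(c_tot / mean_n_pi)


# Hypothetical PI: h-index 40, 6400 total citations,
# and on average 4 principal investigators per paper.
h, c_tot, mean_n_pi = 40, 6400, 4.0

print(h_pi_from_h(h, mean_n_pi))              # 20.0
print(h_pi_from_citations(c_tot, mean_n_pi))  # 20.0
```

The inputs were chosen so that both relations agree; taken together they imply h = (1/2) * sqrt(C_tot), which holds for these numbers (0.5 * 80 = 40).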
====================================================
The Effect of Noise on Software Engineers' Performance (Simone Romano - 11 July, 2018)
The software engineering research community has only marginally investigated the effects of noise on software engineers' performance. In the first experiment, we asked 55 students to comprehend functional requirements with or without exposure to noise, while in the second experiment 42 students were asked to fix faults in Java code
Link: https://arxiv.org/abs/1807.04100
====================================================
The CodRep Machine Learning on Source Code Competition (Zimin Chen - 6 July, 2018)
In particular, it aims at being a common playground on which the machine learning and the software engineering research communities can interact. The competition starts on April 14th 2018 and ends on October 14th 2018
Link: https://arxiv.org/abs/1807.03200
====================================================
Cultural Influences on Requirements Engineering Process in the Context of Saudi Arabia (Tawfeeq Alsanoosy - 5 July, 2018)
Software development requires intensive communication between the requirements engineers and software stakeholders, particularly during the Requirements Engineering (RE) phase. The results reveal 6 RE aspects and 10 cultural factors that have a large impact on the RE practice.
Link: https://arxiv.org/abs/1807.01930
====================================================
Uncertainty Quantification of Electronic and Photonic ICs with Non-Gaussian Correlated Process Variations (Chunfeng Cui - 30 June, 2018)
Since the invention of generalized polynomial chaos in 2002, uncertainty quantification has impacted many engineering fields, including variation-aware design automation of integrated circuits and integrated photonics
Link: https://arxiv.org/abs/1807.01778
====================================================
COTA: Improving the Speed and Accuracy of Customer Support through Ranking and Deep Networks (Piero Molino - 3 July, 2018)
Two machine learning and natural language processing techniques are demonstrated: one relying on feature engineering (COTA v1) and the other exploiting raw signals through deep learning architectures (COTA v2). Finally, an A/B test is conducted in a production setting validating the real-world impact of COTA in reducing issue resolution time by 10 percent without reducing customer satisfaction.
Link: https://arxiv.org/abs/1807.01337
====================================================
Machine learning 2.0 : Engineering Data Driven AI Products (James Max Kanter - 1 July, 2018)
ML 2.0: In this paper, we propose a paradigm shift from the current practice of creating machine learning models - which requires months-long discovery, exploration and "feasibility report" generation, followed by re-engineering for deployment - in favor of a rapid, 8-week process of development, understanding, validation and deployment that can be executed by developers or subject matter experts (non-ML experts) using reusable APIs
Link: https://arxiv.org/abs/1807.00401
====================================================
Product-based Neural Networks for User Response Prediction over Multi-field Categorical Data (Yanru Qu - 1 July, 2018)
Due to the sparsity problems in representation and optimization, most research focuses on feature engineering and shallow modeling. Extensive experiments on 4 industrial datasets and 1 contest dataset demonstrate that our models consistently outperform 8 baselines on both AUC and log loss. In addition, PIN yields a large CTR improvement (34.67% relative) in an online A/B test.
Link: https://arxiv.org/abs/1807.00311
====================================================
A New Benchmark and Progress Toward Improved Weakly Supervised Learning (Jason Ramapuram - 18 September, 2018)
Knowledge Matters: Importance of Prior Information for Optimization [7], by Gulcehre et al., sought to establish the limits of current black-box, deep learning techniques by posing problems which are difficult to learn without engineering knowledge into the model or training procedure. We present results on All-Pairs where our model achieves 100% test accuracy while the best ResNet models achieve 79% accuracy
Link: https://arxiv.org/abs/1807.00126
====================================================
BISMO: A Scalable Bit-Serial Matrix Multiplication Overlay for Reconfigurable Computing (Yaman Umuroglu - 22 June, 2018)
Matrix-matrix multiplication is a key computational kernel for numerous applications in science and engineering, with ample parallelism and data locality that lends itself well to high-performance implementations. We characterize the resource usage and performance of BISMO across a range of parameters to build a hardware cost model, and demonstrate a peak performance of 6.5 TOPS on the Xilinx PYNQ-Z1 board.
Link: https://arxiv.org/abs/1806.08862
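The bit-serial idea named in the entry above can be sketched in plain Python. This is a minimal illustration assuming unsigned integer operands; the function names are hypothetical and are not taken from the BISMO code base. Each operand is split into bit planes, the planes are multiplied as binary matrices (an AND followed by a popcount), and the partial products are summed with the appropriate shift.

```python
def bit_plane(matrix, bit):
    """Return the 0/1 matrix holding bit `bit` of every entry."""
    return [[(x >> bit) & 1 for x in row] for row in matrix]


def binary_matmul(a, b):
    """Multiply two 0/1 matrices; each dot product is an AND plus a popcount."""
    cols = list(zip(*b))
    return [[sum(x & y for x, y in zip(row, col)) for col in cols] for row in a]


def bit_serial_matmul(a, b, a_bits, b_bits):
    """Multiply unsigned integer matrices one pair of bit planes at a time."""
    rows, cols = len(a), len(b[0])
    result = [[0] * cols for _ in range(rows)]
    for i in range(a_bits):
        a_plane = bit_plane(a, i)
        for j in range(b_bits):
            partial = binary_matmul(a_plane, bit_plane(b, j))
            # Weight the binary partial product by 2^(i + j) and accumulate.
            for r in range(rows):
                for c in range(cols):
                    result[r][c] += partial[r][c] << (i + j)
    return result
```

The cost grows with a_bits * b_bits, which is why lower operand precision translates directly into fewer binary matrix products.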
====================================================
Radial Basis Function Approximations: Comparison and Applications (Zuzana Majdisova - 20 June, 2018)
Approximation of scattered data is often a task in many engineering problems. This approach is useful for a higher dimension d>2, because the other methods require the conversion of a scattered dataset to an ordered dataset (i.e
Link: https://arxiv.org/abs/1806.07705
====================================================
A Simple Fusion of Deep and Shallow Learning for Acoustic Scene Classification (Eduardo Fonseca - 27 June, 2018)
In this paper, we propose a system that consists of a simple fusion of two methods of the aforementioned types: a deep learning approach where log-scaled mel-spectrograms are input to a convolutional neural network, and a feature engineering approach, where a collection of hand-crafted features is input to a gradient boosting machine. We report classification accuracy of each method individually and the combined system on the TUT Acoustic Scenes 2017 dataset. The proposed fused system outperforms each of the individual methods and attains a classification accuracy of 72.8% on the evaluation set, improving the baseline system by 11.8%.
Link: https://arxiv.org/abs/1806.07506
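One common way to fuse a deep and a shallow classifier is a weighted average of their per-class probabilities. The sketch below assumes that form of late fusion; the excerpt does not say which fusion rule the authors used, and the function names and the `cnn_weight` parameter are hypothetical.

```python
def fuse_probabilities(cnn_probs, gbm_probs, cnn_weight=0.5):
    """Weighted average of two per-class probability vectors."""
    if len(cnn_probs) != len(gbm_probs):
        raise ValueError("both models must score the same classes")
    return [
        cnn_weight * p + (1.0 - cnn_weight) * q
        for p, q in zip(cnn_probs, gbm_probs)
    ]


def predict(cnn_probs, gbm_probs, labels, cnn_weight=0.5):
    """Return the label with the highest fused probability."""
    fused = fuse_probabilities(cnn_probs, gbm_probs, cnn_weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best]
```

With equal weights, a class that one model ranks second can still win if the other model is confident about it.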
====================================================
Hyper Space Exploration - A Multicriterial Quantitative Trade-Off Analysis for System Design in Complex Environment (Herbert Palm - 14 June, 2018)
While dealing with complicated issues has become an engineering standard, mastering uncertainties in complex environments is still a major issue. Missing experience in unknown technological territory confronts engineers with two questions of paramount importance: 1) How can the best architectural solution within the space of potential alternatives be identified? 2) How can a proof-of-concept for considered solutions be provided prior to implementation? Mastering the risks and uncertainties related to lack of knowledge is one of the most prominent tasks in such projects
Link: https://arxiv.org/abs/1806.05950
====================================================
ServeNet: A Deep Neural Network for Web Service Classification (Yilong Yang - 14 June, 2018)
However, they can only predict around 10 to 20 service categories due to the quality of feature engineering and the imbalance problem of service dataset. To demonstrate the effectiveness of our approach, we conducted a comprehensive experimental study on 10,000 real-world services in 50 categories
Link: https://arxiv.org/abs/1806.05437
====================================================
Detecting Speech Act Types in Developer Question/Answer Conversations During Bug Repair (Andrew Wood - 3 July, 2018)
The key application of this work is to advance the state of the art for virtual assistants in software engineering
Link: https://arxiv.org/abs/1806.05130
====================================================
A Product Line Systems Engineering Process for Variability Identification and Reduction (Mole Li - 12 June, 2018)
To evaluate the effectiveness of the process in the reduction of variation points, it is further applied to case studies in different engineering domains at different levels of complexity. Subject to system model availability, reductions of 14% to 40% in the number of variation points are demonstrated in the case studies.
Link: https://arxiv.org/abs/1806.04705
====================================================
Engaging Millennials into Learning Formal Methods (Néstor Cataño - 9 June, 2018)
This paper summarizes our experience in teaching courses on formal methods (FM) to Computer Science (CS) and Software Engineering (SE) students at various universities around the world, including University of Madeira (UMa) in Portugal, Pontificia Universidad Javeriana (PUJ) and University of Los Andes (Uniandes) in Colombia, Carnegie Mellon University (CMU) in the USA, and at Innopolis University (INNO) in the Russian Federation. We report challenges faced during the past 10 to 15 years in teaching FM to millennial undergraduate and graduate students and describe how we have coped with those challenges
Link: https://arxiv.org/abs/1806.03527
====================================================
Feature Pyramid Network for Multi-Class Land Segmentation (Selim S. Seferbekov - 19 June, 2018)
Solving it can help to overcome many obstacles in urban planning, environmental engineering or natural landscape monitoring. Based on validation results, leaderboard score and our own experience this network shows reliable results for the DEEPGLOBE - CVPR 2018 land cover classification sub-challenge. Moreover, this network moderately uses memory that allows using GTX 1080 or 1080 TI video cards to perform whole training and makes pretty fast predictions.
Link: https://arxiv.org/abs/1806.03510
====================================================
Corpus Conversion Service: A Machine Learning Platform to Ingest Documents at Scale (Peter W J Staar - 24 May, 2018)
The CCS platform is currently deployed on IBM internal infrastructure and serving more than 250 active users for knowledge-engineering project engagements.
Link: https://arxiv.org/abs/1806.02284
====================================================
A Systematic Mapping Study on Security in Agile Requirements Engineering (H. Villamizar - 4 June, 2018)
[Background] The rapidly changing business environments in which many companies operate are challenging traditional Requirements Engineering (RE) approaches. [Results] In total, we identified 21 studies that met our inclusion criteria, dated from 2005 to 2017
Link: https://arxiv.org/abs/1806.01366
====================================================
Large-Scale Neuromorphic Spiking Array Processors: A quest to mimic the brain (Chetan Singh Thakur - 22 May, 2018)
The brain has evolved over billions of years to solve difficult engineering problems by using efficient, parallel, low-power computation. This interdisciplinary field was listed among the top 10 technology breakthroughs of 2014 by the MIT Technology Review and among the top 10 emerging technologies of 2015 by the World Economic Forum
Link: https://arxiv.org/abs/1805.08932
====================================================
Are Computer Science and Engineering Graduates Ready for the Software Industry? Experiences from an Industrial Student Training Program (Eray Tuzun - 22 May, 2018)
It has been 50 years since the term software engineering was coined in 1968 at a NATO conference. We support this insight with pre- and post-training data collected from the participants during the first edition of such a summer school and a follow-up questionnaire conducted after a year with the graduates, 50% of whom were hired by the company shortly after the summer school.
Link: https://arxiv.org/abs/1805.08894
====================================================
Fast Symbolic 3D Registration Solution (Jin Wu - 12 May, 2018)
3D registration has always been performed invoking singular value decomposition (SVD) or eigenvalue decomposition (EIG) in real engineering practices. Experimental results show that the proposed solver does not lose accuracy and robustness but improves the execution speed to a large extent, by almost 50% to 80%, on both personal computer and embedded processor.
Link: https://arxiv.org/abs/1805.08703
====================================================
Status Quo in Requirements Engineering: A Theory and a Global Family of Surveys (Stefan Wagner - 14 September, 2018)
Requirements Engineering (RE) has established itself as a software engineering discipline during the past decades. We designed a survey instrument and theory that has now been replicated in 10 countries world-wide. We report on the underlying theory and the full results obtained from the replication studies with participants from 228 organisations
Link: https://arxiv.org/abs/1805.07951
====================================================
DroidMark: A Tool for Android Malware Detection using Taint Analysis and Bayesian Network (Dhruv Rathi - 15 June, 2018)
The detection system named DroidMark looks for possible sinks and sources of data leakage in the application by modelling Android lifecycle and callbacks, which is done by Reverse Engineering the APK, further monitoring the suspected processes and collecting data in different states of the application. The results indicate a high accuracy of 96.87% and an error rate of 3.13% in the detection of Malware in Android devices.
Link: https://arxiv.org/abs/1805.06620
====================================================
Automated Vision-based Bridge Component Extraction Using Multiscale Convolutional Neural Networks (Yasutaka Narazaki - 15 May, 2018)
Image data has a great potential of helping post-earthquake visual inspections of civil engineering structures due to the ease of data acquisition and the advantages in capturing visual information. The bridge component recognition begins with pixel-wise classifications of an image into 10 scene classes
Link: https://arxiv.org/abs/1805.06042
====================================================
Vision-based Automated Bridge Component Recognition Integrated With High-level Scene Understanding (Yasutaka Narazaki - 15 May, 2018)
Image data has a great potential of helping conventional visual inspections of civil engineering structures due to the ease of data acquisition and the advantages in capturing visual information. To reduce false-positives and get consistent labels, the component classifications are integrated with scene understanding by an additional classifier with 10 higher-level scene classes (building, greenery, person, pavement, signs and poles, vehicles, bridges, water, sky, and others)
Link: https://arxiv.org/abs/1805.06041
====================================================
A Chaos Engineering System for Live Analysis and Falsification of Exception-handling in the JVM (Long Zhang - 14 May, 2018)
In this paper, we propose a novel design and implementation of a chaos engineering system in Java called CHAOSMACHINE. To evaluate our approach, we have deployed CHAOSMACHINE on top of 3 large-scale and well-known Java applications totaling 630k lines of code
Link: https://arxiv.org/abs/1805.05246
====================================================
Fork and Join Queueing Networks with Heavy Tails: Scaling Dimension and Throughput Limit (Yun Zeng - 14 May, 2018)
While engineering solutions have long been made to build and scale such systems, it is challenging to rigorously characterize their throughput performance at scale theoretically. We investigate throughput scalability by focusing on heavy-tailed service times that are regularly varying (with index $α>1$) and featuring the network topology described by the two aforementioned dimensions
Link: https://arxiv.org/abs/1805.05197
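A single fork-join stage with heavy-tailed service times can be sketched as follows. This is an illustration only: it assumes a Pareto distribution, whose tail P(X > x) = (scale / x)^α is one example of a regularly varying tail, and all function names are hypothetical rather than taken from the paper.

```python
import random


def pareto_service_time(u, alpha, scale=1.0):
    """Inverse-CDF sample with tail P(X > x) = (scale / x) ** alpha, for u in (0, 1]."""
    if not 0.0 < u <= 1.0:
        raise ValueError("u must lie in (0, 1]")
    return scale * u ** (-1.0 / alpha)


def fork_join_time(task_times):
    """A forked job joins only when its slowest parallel task has finished."""
    return max(task_times)


def simulate_stage(num_servers, alpha, rng):
    """Completion time of one job forked across `num_servers` parallel servers."""
    return fork_join_time(
        pareto_service_time(1.0 - rng.random(), alpha) for _ in range(num_servers)
    )
```

Because the join waits for the maximum, a single slow task drawn from the heavy tail delays the whole job, and this effect grows with the number of parallel servers.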
====================================================
Deep Learning in Software Engineering (Xiaochen Li - 13 May, 2018)
In recent years, deep learning has become increasingly prevalent in the field of Software Engineering (SE). To answer these questions, we conduct a bibliography analysis on 98 research papers in SE that use deep learning techniques. We find that 41 SE tasks in all SE phases have been facilitated by deep learning integrated solutions. Of these papers, 84.7% only use standard deep learning models and their variants to solve SE problems
Link: https://arxiv.org/abs/1805.04825
====================================================
Proceedings Joint Workshop on Handling IMPlicit and EXplicit knowledge in formal system development (IMPEX) and Formal and Model-Driven Techniques for Developing Trustworthy Systems (FM&MDD) (Régine Laleau - 11 May, 2018)
This volume contains the joint proceedings of IMPEX 2017, the first workshop on Handling IMPlicit and EXplicit knowledge in formal system development and FM&MDD, the second workshop on Formal and Model-Driven Techniques for Developing Trustworthy Systems (FM&MDD) held together on November 16, 2017 in Xi'an, China, as part of ICFEM 2017, 19th International Conference on Formal Engineering Methods.
Link: https://arxiv.org/abs/1805.04636
====================================================
Multi-View Semantic Labeling of 3D Point Clouds for Automated Plant Phenotyping (Bernhard Japes - 29 May, 2018)
By evaluating our approach with challenging datasets we achieve state-of-the-art results without difficult and time consuming feature engineering as being necessary in traditional approaches to semantic labeling.
Link: https://arxiv.org/abs/1805.03994
====================================================
Human Capital in Software Engineering: A Systematic Mapping of Reconceptualized Human Aspect Studies (Saya Onoue - 10 May, 2018)
In this study, we reconceptualize human aspects of software engineering (SE) into a framework (i.e., SE human capital). From premium SE publishing venues (five journal articles and four conferences), we extract 2,698 hits of papers published between 2013 and 2017. Using search criteria, we then narrow our results to 340 papers. Finally, we use inclusion and exclusion criteria to manually select 78 papers (49 quantitative and 29 qualitative studies)
Link: https://arxiv.org/abs/1805.03844
====================================================
Fifty Years of Software Engineering - or - The View from Garmisch (Brian Randell - 7 May, 2018)
On several earlier anniversaries of the 1968-69 NATO Software Engineering conferences I have acceded to requests to provide some reminiscences. But some large software projects in the latter bespoke category still suffer from problems that are all too reminiscent of those that, in 1968, gave rise to discussion of a "software crisis".
Link: https://arxiv.org/abs/1805.02742
====================================================
#ILookLikeAnEngineer: Using Social Media Based Hashtag Activism Campaigns as a Lens to Better Understand Engineering Diversity Issues (Aqdas Malik - 4 May, 2018)
Each year, significant investment of time and resources is made to improve diversity within engineering across a range of federal and state agencies, private/not-for-profit organizations, and foundations. Almost 87% of the American population now participates in some form of social media activity
Link: https://arxiv.org/abs/1805.01971
====================================================
Using Multi Expression Programming in Software Effort Estimation (Najla Akram - 30 April, 2018)
Estimating the effort of software systems is an essential topic in software engineering; carrying out an estimation process reliably and accurately for software forms a vital part of the software development phases. Results show that MEP is far better in discovering effective functions for the estimation of about 6 datasets each comprising several projects.
Link: https://arxiv.org/abs/1805.00090
====================================================
How Diverse Users and Activities Trigger Connective Action via Social Media: Lessons from the Twitter Hashtag Campaign #ILookLikeAnEngineer (Aditya Johri - 24 April, 2018)
We present a study that examines how a social media activism campaign aimed at improving gender diversity within engineering gained and maintained momentum in its early period. We categorize these triggers into four types: 1) Event-Driven: Alignment of the campaign with offline events related to the issue (Diversity SFO, Disrupt, etc.); 2) Media-Driven: News coverage of the events in the media (TechCrunch, CNN, BBC, etc.); 3) Industry-Driven: Web participation in the campaign by large organizations (Microsoft, Tesla, GE, Cisco, etc.); and 4) Personality-Driven: Alignment of the events with popular and/or known personalities (e.g
Link: https://arxiv.org/abs/1804.09226
====================================================
Event Extraction with Generative Adversarial Imitation Learning (Tongtao Zhang - 20 April, 2018)
Moreover, our experiments also demonstrate that the proposed framework outperforms state-of-the-art methods, without explicit feature engineering.
Link: https://arxiv.org/abs/1804.07881
====================================================
An Integrated Development Environment for Planning Domain Modeling (Yuncong Li - 19 April, 2018)
However, current knowledge engineering tools with visual modeling, like itSIMPLE (Vaquero et al. 2012) and VIZ (Vodrážka and Chrpa 2010), are less efficient than the traditional method of hand-coding by a PDDL expert using a text editor, and rarely involved in finetuning planning domains depending on the plan validation
Link: https://arxiv.org/abs/1804.07013
====================================================
Incorporating Dictionaries into Deep Neural Networks for the Chinese Clinical Named Entity Recognition (Qi Wang - 13 April, 2018)
Previous statistical methods and feature engineering practice have demonstrated that human knowledge can provide valuable information for handling rare and unseen cases. Computational results on the CCKS-2017 Task 2 benchmark dataset show that the proposed method achieves the highly competitive performance compared with the state-of-the-art deep learning methods.
Link: https://arxiv.org/abs/1804.05017
====================================================
DeepFM: An End-to-End Wide & Deep Learning Framework for CTR Prediction (Huifeng Guo - 16 May, 2018)
Compared to the latest Wide & Deep model from Google, DeepFM has a shared raw feature input to both its "wide" and "deep" components, with no need of feature engineering besides raw features. We conduct online A/B test in Huawei App Market, which reveals that DeepFM-D leads to more than 10% improvement of click-through rate in the production environment, compared to a well-engineered LR model
Link: https://arxiv.org/abs/1804.04950
====================================================
An Experimental Evaluation of a De-biasing Intervention for Professional Software Developers (Martin Shepperd - 11 April, 2018)
OBJECTIVE: We aimed to replicate this anchoring bias using professionals and, novel in a software engineering context, explore de-biasing interventions through increasing knowledge and awareness of judgement biases. METHOD: We ran two series of experiments in company settings with a total of 410 software developers
Link: https://arxiv.org/abs/1804.03919
====================================================
Software Engineering for Millennials, by Millennials (Jocelyn Simmonds - 5 April, 2018)
Our university offers a 5.5 year program that mixes computer science, software and computer engineering, where the first two years are mostly math and physics courses. We decided to redesign this course in 2017, trying to achieve a balance between theory and practice, and technical and professional skills, with a maximum course workload of 150 hrs per semester
Link: https://arxiv.org/abs/1804.03518
====================================================
Proactive Empirical Assessment of New Language Feature Adoption via Automated Refactoring: The Case of Java 8 Default Methods (Raffi Khatchadourian - 27 March, 2018)
Studying how developers use (or do not use) new language features is important in programming language research and engineering because it gives designers insight into the usability of the language to create meaningful programs in that language. Here, we explore Java 8 default methods, which allow interfaces to contain (instance) method implementations.
Link: https://arxiv.org/abs/1803.10198
====================================================
SEAT: A Taxonomy to Characterize Automation in Software Engineering (Shipra Sharma - 26 March, 2018)
In this paper we present such a characterization of ASE tools and major constituent techniques from different areas of computer science and engineering that have been employed by such ASE tools. To develop the characterization we carried out an extensive systematic literature review over about 1175 ASE research articles
Link: https://arxiv.org/abs/1803.09536
====================================================
Poster: Communication in Open-Source Projects--End of the E-mail Era? (Verena Käfer - 26 March, 2018)
Communication is essential in software engineering. In this study, we fill the knowledge gap by investigating a statistically representative sample of 400 GitHub projects
Link: https://arxiv.org/abs/1803.09529
====================================================
Natural Language or Not (NLoN) - A Package for Software Engineering Text Analysis Pipeline (Mika V. Mäntylä - 20 March, 2018)
In order to correctly perform NLP, we must pre-process the textual information to separate natural language from other information, such as log messages, that are often part of the communication in software engineering. Although our NLoN package relies on only 11 language features and character tri-grams, we are able to achieve an area under the ROC curve performances between 0.976-0.987 on three different data sources, with Lasso regression from Glmnet as our learner and two human raters for providing ground truth. Cross-source prediction performance is lower and has more fluctuation with top ROC performances from 0.913 to 0.980
Link: https://arxiv.org/abs/1803.07292
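Character tri-grams, one of the two feature families mentioned in the entry above, are simple to extract. The sketch below is a generic illustration with a hypothetical function name; it is not the NLoN package's implementation, and it does not cover the 11 language features or the Lasso learner.

```python
from collections import Counter


def char_trigrams(text):
    """Count every overlapping three-character window in `text`."""
    return Counter(text[i:i + 3] for i in range(len(text) - 2))
```

For example, a log line and an English sentence of similar length typically produce very different tri-gram counts, which is the kind of signal a classifier can use to separate the two.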
====================================================
A Simple and Effective Approach to the Story Cloze Test (Siddarth Srinivasan - 14 March, 2018)
Following this approach, we present a simpler fully-neural approach to the Story Cloze Test using skip-thought embeddings of the stories in a feed-forward network that achieves close to state-of-the-art performance on this task without any feature engineering
Link: https://arxiv.org/abs/1803.05547
====================================================
An Unsupervised Model with Attention Autoencoders for Question Retrieval (Minghua Zhang - 9 March, 2018)
Previous research focuses on supervised models which depend heavily on training data and manual feature engineering. The experimental results show that our unsupervised model obtains comparable performance with the state-of-the-art supervised methods in SemEval-2016 Task 3, and outperforms the best system in SemEval-2017 Task 3 by a wide margin.
Link: https://arxiv.org/abs/1803.03476
====================================================
Super Compaction and Pluripotent Shape Transformation via Algorithmic Stacking for 3D Deployable Structures (Zhonghua Xi - 8 March, 2018)
The huge initial dimension of the 2D flattened structure makes fabrication difficult, and defeats the main purpose, namely compactness, of many origami-inspired engineering. Depending on the surface thickness, the stacked structure takes merely 0.001% to 6% of the original volume
Link: https://arxiv.org/abs/1803.03302
====================================================
SysML/KAOS Domain Models and B System Specifications (Steve Jeffrey Tueno Fotso - 28 June, 2018)
In this paper, we use a combination of the SysML/KAOS requirements engineering method, an extension of SysML, with concepts of the KAOS goal model, and of the B System formal method. Finally, we provide a review of the application of the SysML/KAOS method on case studies such as for the formal specification of the hybrid ERTMS/ETCS level 3 standard.
Link: https://arxiv.org/abs/1803.01972
====================================================
Learning Flexible and Reusable Locomotion Primitives for a Microrobot (Brian Yang - 28 February, 2018)
The design of gaits for robot locomotion can be a daunting process which requires significant expert knowledge and engineering. Moreover, the experimental simulations show that without any prior knowledge about the robot used (e.g., dynamics model), our approach is capable of learning locomotion primitives within 250 trials and subsequently using them to successfully navigate through a maze.
Link: https://arxiv.org/abs/1803.00196
====================================================
Extractive Text Summarization using Neural Networks (Aakash Sinha - 27 February, 2018)
Traditional approaches to text summarization rely heavily on feature engineering. We train and evaluate the model on standard DUC 2002 dataset which shows results comparable to the state of the art models
Link: https://arxiv.org/abs/1802.10137
====================================================
Minimizing Flow Completion Times using Adaptive Routing over Inter-Datacenter Wide Area Networks (Mohammad Noormohammadpour - 25 February, 2018)
A popular approach widely used for traffic engineering is based on current bandwidth utilization of links. We propose an alternative that reduces bandwidth usage by up to at least 50% and flow completion times by up to at least 40% across various scheduling policies and flow size distributions.
Link: https://arxiv.org/abs/1802.09080
====================================================
SmartUnit: Empirical Evaluations for Automated Unit Testing of Embedded Software in Industry (Chengyu Zhang - 17 June, 2018)
To achieve the goal, by analyzing the industrial requirements and our previous work on automated unit testing tool CAUT, we rebuild a new tool, SmartUnit, to solve the engineering requirements that take place in our partner companies. From our experimental results, in general, more than 90% of functions in commercial embedded software achieve 100% statement, branch, MC/DC coverage, more than 80% of functions in SQLite achieve 100% MC/DC coverage, and more than 60% of functions in PostgreSQL achieve 100% MC/DC coverage
Link: https://arxiv.org/abs/1802.08547
====================================================
Statistical Software for Psychology: Comparing Development Practices Between CRAN and Other Communities (Spencer Smith - 20 February, 2018)
Method: We compared and ranked 30 software tools with respect to adherence to best software engineering practices on items that could be measured by end-users
Link: https://arxiv.org/abs/1802.07362
====================================================
Bias Compensation in Iterative Soft-Feedback Algorithms with Application to (Discrete) Compressed Sensing (Susanne Sparrer - 20 February, 2018)
Although so-called soft feedback is widely employed in many different fields of engineering, typically the biased estimate is used. Numerical results show that when employed in iterative reconstruction algorithms for Compressed Sensing, a gain of 1.2 dB due to proper unbiasing is possible.
Link: https://arxiv.org/abs/1802.07105
====================================================
Formal Analysis of Galois Field Arithmetics - Parallel Verification and Reverse Engineering (Cunxi Yu - 16 February, 2018)
This paper presents a computer algebra technique that performs verification and reverse engineering of GF($2^m$) multipliers directly from the gate-level implementation. The approach is based on extracting a unique irreducible polynomial in a parallel fashion and proceeds in three steps: 1) determine the bit position of the output bits; 2) determine the bit position of the input bits; and 3) extract the irreducible polynomial used in the design
Link: https://arxiv.org/abs/1802.06870
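To show the role the irreducible polynomial plays in a GF($2^m$) multiplier, here is a minimal software sketch of shift-and-add multiplication with modular reduction. It illustrates the arithmetic only, not the paper's gate-level extraction technique; the function name and the integer encoding of `poly` (bit i holds the coefficient of x^i, including the x^m term) are assumptions made for this example.

```python
def gf2m_multiply(a, b, poly, m):
    """Multiply a and b in GF(2^m), reducing by the irreducible polynomial `poly`.

    Both operands must be below 2**m; `poly` must include the x^m term.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^m) is XOR
        b >>= 1
        a <<= 1              # multiply a by x
        if (a >> m) & 1:     # degree reached m: reduce modulo poly
            a ^= poly
    return result
```

With m = 8 and poly = 0x11B (x^8 + x^4 + x^3 + x + 1, the polynomial used by AES), the same multiplier circuit computes different products than it would under another irreducible polynomial, which is why recovering the polynomial identifies the field.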
====================================================
Full Virtualization of Renault's Engine Management Software and Application to System Development (Dirk Von Wissel - 16 February, 2018)
Domain: Critical Transportation Systems Topic: Processes, methods and tools, in particular: virtual engineering and simulation 1. Motivation Since 2010, Renault has established a framework to develop engine control software for Diesel and Gasoline engines [6]. In the Renault EMS architecture software is composed into about 20 functions, such as Air System, Combustion etc. The Renault EMS development process includes basically the following steps [5]. 1. Specification of about 200 generic configurable modules per ECU using MATLAB/Simulink. 2. 3. To ensure software quality, this step is repeatedly performed with steps 1 and 2, based on the simulation capabilities of MATLAB/Simulink. 4. 5. 6. In contrast to step 3, the interactions of all modules and interactions with the system environment are visible then and subject to testing. Critical assessment of the above process shows that there is a considerable delay between delivery of a set of specifications to the software project team (at the end of step 3) and system-level tests based on an ECU that runs the entire software (step 6)
Link: https://arxiv.org/abs/1802.06841
====================================================
The Dangerous Dogmas of Software Engineering (Paul Ralph - 17 February, 2018)
This paper analyzes the nature and detrimental effects of four software engineering dogmas - 1) the belief that software has "requirements"; 2) the division of software engineering tasks into analysis, design, coding and testing; 3) the belief that software engineering is predominantly concerned with designing "software" systems; 4) the belief that software engineering follows methods effectively
Link: https://arxiv.org/abs/1802.06321
====================================================
Consensus in Software Engineering: A Cognitive Mapping Study (Pontus Johnson - 17 February, 2018)
Method: A convenience sample of 60 software engineering researchers produced diagrams describing their personal understanding of causal relationships between core software engineering constructs
Link: https://arxiv.org/abs/1802.06319
====================================================
Event Nugget Detection with Forward-Backward Recurrent Neural Networks (Reza Ghaeini - 15 February, 2018)
Recent deep learning approaches alleviate this problem by automatic feature engineering. Experimental results demonstrate that FBRNN is competitive with the state-of-the-art methods on the ACE 2005 and the Rich ERE 2015 event detection tasks.
Link: https://arxiv.org/abs/1802.05672
====================================================
500+ Times Faster Than Deep Learning (A Case Study Exploring Faster Methods for Text Mining StackOverflow) (Suvodeep Majumder - 14 February, 2018)
Deep learning methods are useful for high-dimensional data and are becoming widely used in many areas of software engineering. This approach is over 500 times faster than deep learning (and over 900 times faster if we use all the cores on a standard laptop computer)
Link: https://arxiv.org/abs/1802.05319
====================================================
Towards Generic Deobfuscation of Windows API Calls (Vadim Kotov - 13 February, 2018)
To complicate the reverse engineering of their programs, malware authors deploy API obfuscation techniques, hiding them from analysts' eyes and anti-malware scanners. Our best prediction model can correctly identify API names with 87.60% accuracy.
Link: https://arxiv.org/abs/1802.04466
====================================================
Buy your coffee with bitcoin: Real-world deployment of a bitcoin point of sale terminal (Shayan Eskandari - 12 February, 2018)
Following a requirements engineering approach, we designed and implemented a new Point of Sale (PoS) system that satisfies an optimal set of criteria within our evaluation framework. Our open source system, Aunja PoS, has been deployed in a real world cafe since October 2014.
Link: https://arxiv.org/abs/1802.04236
====================================================
Geodesic Convolutional Shape Optimization (Pierre Baqué - 12 February, 2018)
Existing methods, however, are so computationally demanding that typical engineering practices are to either simply try a limited number of hand-designed shapes or restrict oneself to shapes that can be parameterized using only a few degrees of freedom. This outperforms state-of-the-art methods by 5 to 20% for standard problems and, even more importantly, our approach applies to cases that previous methods cannot handle.
Link: https://arxiv.org/abs/1802.04016
====================================================
FD-MobileNet: Improved MobileNet with a Fast Downsampling Strategy (Zheng Qin - 11 February, 2018)
(iii) It is engineering-friendly and provides fast actual inference speed. Experiments on ILSVRC 2012 and PASCAL VOC 2007 datasets demonstrate that FD-MobileNet consistently outperforms MobileNet and achieves comparable results with ShuffleNet under different computational budgets, for instance, surpassing MobileNet by 5.5% on the ILSVRC 2012 top-1 accuracy and 3.6% on the VOC 2007 mAP under a complexity of 12 MFLOPs. On an ARM-based device, FD-MobileNet achieves 1.11$\times$ inference speedup over MobileNet and 1.82$\times$ over ShuffleNet under the same complexity.
Link: https://arxiv.org/abs/1802.03750
====================================================
Exploiting Spin-Orbit Torque Devices as Reconfigurable Logic for Circuit Obfuscation (Jianlei Yang - 8 February, 2018)
Circuit obfuscation is a frequently used approach to conceal logic functionalities in order to prevent reverse engineering attacks on fabricated chips. Experiments on MCNC and ISCAS 85/89 benchmark suites show that the proposed approach could reduce the area overheads due to obfuscation by 10% on average.
Link: https://arxiv.org/abs/1802.02789
====================================================
A Patterns Based Approach for Design of Educational Technologies (Sridhar Chimalakonda - 7 February, 2018)
We then present the notion of Pattern-Oriented Instructional Design (POID) as a way to model instructional design as a connection of patterns (GoalPattern, ProcessPattern, ContentPattern) and integrate it with Pattern-Oriented Software Architecture (POSA) based on fundamental principles in software engineering. We demonstrate our approach through an adult literacy case study (287 million learners, 22 Indian languages and a variety of instructional designs)
Link: https://arxiv.org/abs/1802.02663
====================================================
Ways of Applying Artificial Intelligence in Software Engineering (Robert Feldt - 7 February, 2018)
Some work has been done to better understand the interaction between Software Engineering and AI but we lack methods to classify ways of applying AI in software systems and to analyse and understand the risks this poses. We show the usefulness of this taxonomy by classifying 15 papers from previous editions of the RAISE workshop
Link: https://arxiv.org/abs/1802.02033
====================================================
Online Compact Convexified Factorization Machine (Wenpeng Zhang - 5 February, 2018)
Factorization Machine (FM) is a supervised learning approach with a powerful capability of feature engineering. To evaluate the empirical performance of OCCFM, we conduct extensive experiments on 6 real-world datasets for online recommendation and binary classification tasks
Link: https://arxiv.org/abs/1802.01379
====================================================
Heuristic Feature Selection for Clickbait Detection (Matti Wiegmann - 4 February, 2018)
Unlike most other approaches submitted to the challenge, the baseline approach is based on manual feature engineering and does not compete out of the box with many of the deep learning-based approaches. We show that scaling up feature selection efforts to heuristically identify better-performing feature subsets catapults the performance of the baseline classifier to second rank overall, beating 12 other competing approaches and improving over the baseline performance by 20%
Link: https://arxiv.org/abs/1802.01191
====================================================
Digitalization of Swedish Government Agencies - A Perspective Through the Lens of a Software Development Census (Markus Borg - 11 February, 2018)
Software engineering is at the core of the digitalization of society. Ill-informed decisions can have major consequences, as made evident in the 2017 government crisis in Sweden, originating in a data breach caused by an outsourcing deal made by the Swedish Transport Agency. We show that 39.2% of the GovAgs develop software internally, some matching the number of developers in large companies
Link: https://arxiv.org/abs/1802.00312
====================================================
Cross-type Biomedical Named Entity Recognition with Deep Multi-Task Learning (Xuan Wang - 7 October, 2018)
Although recent studies explored using neural network models for BioNER to free experts from manual feature engineering, the performance remains limited by the available training data for each entity type. In experiments on 15 benchmark BioNER datasets, our multi-task model achieves substantially better performance compared with state-of-the-art BioNER systems and baseline neural sequence labeling models
Link: https://arxiv.org/abs/1801.09851
====================================================
Semi-Supervised Convolutional Neural Networks for Human Activity Recognition (Ming Zeng - 22 January, 2018)
However, the semi-supervised methods studied in the activity recognition literature assume that feature engineering is already done. In experiments on three real world datasets, we show that our CNNs outperform supervised methods and traditional semi-supervised learning methods by up to 18% in mean F1-score (Fm).
Link: https://arxiv.org/abs/1801.07827
====================================================
Deep Learning for Electromyographic Hand Gesture Signal Classification Using Transfer Learning (Ulysse Côté-Allard - 12 June, 2018)
As such, this paper proposes applying transfer learning on the aggregated data of multiple users, while leveraging the capacity of deep learning algorithms to learn discriminant features from large datasets, without the need for in-depth feature engineering. These two datasets are comprised of 19 and 17 able-bodied participants respectively. A third dataset, also recorded with the Myo Armband, was taken from the NinaPro database and is comprised of 10 able-bodied participants. It achieves an average accuracy of 98.31% for 7 hand/wrist gestures over 17 able-bodied participants and 65.57% for 18 hand/wrist gestures over 10 able-bodied participants
Link: https://arxiv.org/abs/1801.07756
====================================================
Guidelines for Systematic Mapping Studies in Security Engineering (Michael Felderer - 21 January, 2018)
Because security engineering is similar to software engineering in that it bridges research and practice, researchers can use the same basic systematic mapping process, as follows: (1) study planning, (2) searching for studies, (3) study selection, (4) study quality assessment, (5) data extraction, (6) data classification, (7) data analysis, and (8) reporting of results
Link: https://arxiv.org/abs/1801.06810
====================================================
Visual Data Augmentation through Learning (Grigorios G. Chrysos - 20 January, 2018)
In addition, the state-of-the-art data-driven methods demand a vast amount of data, hence a standard engineering trick employed is artificial data augmentation for instance by adding into the data cropped and (affinely) transformed images
Link: https://arxiv.org/abs/1801.06665
====================================================
Experience-driven Networking: A Deep Reinforcement Learning based Approach (Zhiyuan Xu - 17 January, 2018)
Specifically, we, for the first time, propose to leverage emerging Deep Reinforcement Learning (DRL) for enabling model-free control in communication networks; and present a novel and highly effective DRL-based control framework, DRL-TE, for a fundamental networking problem: Traffic Engineering (TE). Extensive packet-level simulation results show that 1) compared to several widely-used baseline methods, DRL-TE significantly reduces end-to-end delay and consistently improves the network utility, while offering better or comparable throughput; 2) DRL-TE is robust to network changes; and 3) DRL-TE consistently outperforms a state-of-the-art DRL method (for continuous control), Deep Deterministic Policy Gradient (DDPG), which, however, does not offer satisfying performance.
Link: https://arxiv.org/abs/1801.05757
====================================================
HeNet: A Deep Learning Approach on Intel® Processor Trace for Effective Exploit Detection (Li Chen - 8 January, 2018)
The use of hardware trace adds portability to our system and the use of deep learning eliminates the manual effort of feature engineering. HeNet achieves 100% accuracy and 0% false positive on test set, and higher classification accuracy compared to classical machine learning algorithms.
Link: https://arxiv.org/abs/1801.02318
====================================================
Online Multicast Traffic Engineering for Software-Defined Networks (Sheng-Hao Chiang - 30 December, 2017)
Different from traditional shortest-path trees (SPT) and graph theoretical Steiner trees (ST), which concentrate on routing one tree at any instant, online SDN multicast traffic engineering is more challenging because it needs to support dynamic group membership and optimize a sequence of correlated trees without the knowledge of future join and leave, whereas the scalability of SDN due to limited TCAM is also crucial. We prove that OBST is NP-hard and does not have a $|D_{max}|^{1-ε}$-competitive algorithm for any $ε>0$, where $|D_{max}|$ is the largest group size at any time. The simulations and implementation on real SDNs with YouTube traffic manifest that the total cost can be reduced by at least 25% compared with SPT and ST, and the computation time is small for massive SDN.
Link: https://arxiv.org/abs/1801.00110
====================================================
Long-Term Mobile Traffic Forecasting Using Deep Spatio-Temporal Neural Networks (Chaoyun Zhang - 21 December, 2017)
Forecasting with high accuracy the volume of data traffic that mobile users will consume is becoming increasingly important for precision traffic engineering, demand-aware network resource allocation, as well as public transportation. Experiments we conduct with real-world mobile traffic data sets, collected over 60 days in both urban and rural areas, demonstrate that the proposed (D-)STN schemes perform up to 10-hour long predictions with remarkable accuracy, irrespective of the time of day when they are triggered. Specifically, our solutions achieve up to 61% smaller prediction errors as compared to widely used forecasting approaches, while operating with up to 600 times shorter measurement intervals.
Link: https://arxiv.org/abs/1712.08083
====================================================
Accelerating the computation of FLAPW methods on heterogeneous architectures (Davor Davidović - 19 December, 2017)
Legacy codes in computational science and engineering have been very successful in providing essential functionality to researchers. Our final code attains over 70% of the architectures' peak performance, and outperforms Nvidia's and Intel's libraries
Link: https://arxiv.org/abs/1712.07206
====================================================
Fourteen Years of Software Engineering at ETH Zurich (Bertrand Meyer - 16 December, 2017)
A Chair of Software Engineering existed at ETH Zurich, the Swiss Federal Institute of Technology, from 1 October 2001 to 31 January 2016, under my leadership
Link: https://arxiv.org/abs/1712.05078
====================================================
Whole-Body Nonlinear Model Predictive Control Through Contacts for Quadrupeds (Michael Neunert - 7 December, 2017)
Yet, thorough numerical and software engineering allows for running the nonlinear Optimal Control solver at rates up to 190 Hz on a quadruped for a time horizon of half a second
Link: https://arxiv.org/abs/1712.02889
====================================================
Development of Statewide AADT Estimation Model from Short-Term Counts: A Comparative Study for South Carolina (Sakib Mahmud Khan - 29 November, 2017)
Annual Average Daily Traffic (AADT) is an important parameter used in traffic engineering analysis. In South Carolina, 87% of the ATRs are located on interstates and arterial highways. Among all developed models for different functional roadway classes, the SVR-based model shows a minimum root mean square error (RMSE) of 0.22 and a mean absolute percentage error (MAPE) of 11.3% for the interstate/expressway functional class. SVR models are validated for each roadway functional class using the 2016 ATR data and selected short-term count data collected by the South Carolina Department of Transportation (SCDOT)
Link: https://arxiv.org/abs/1712.01257
====================================================
On Using Network Science in Mining Developers Collaboration in Software Engineering: A Systematic Literature Review (Mohammed Abufouda - 3 December, 2017)
This study and its findings are expected to be of benefit for software engineering practitioners and researchers who are mining software repositories using tools from the network science field. We identified 35 primary studies (PSs) from 4 digital libraries, then we extracted data from each PS according to a predefined data extraction sheet
Link: https://arxiv.org/abs/1712.00865
====================================================
Recurrent Neural Network Language Models for Open Vocabulary Event-Level Cyber Anomaly Detection (Aaron Tuor - 2 December, 2017)
This work introduces a flexible, powerful, and unsupervised approach to detecting anomalous behavior in computer and network logs, one that largely eliminates domain-dependent feature engineering employed by existing methods. For log-line-level red team detection, our best performing character-based model provides test set area under the receiver operator characteristic curve of 0.98, demonstrating the strong fine-grained anomaly detection performance of this approach on open vocabulary logging sources.
Link: https://arxiv.org/abs/1712.00557
====================================================
Game Development Software Engineering Process Life Cycle: A Systematic Review (Saiqa Aleem - 22 November, 2017)
The purpose of this study is to assess the state-of-the-art research on the game development software engineering process and highlight areas that need further consideration by researchers
Link: https://arxiv.org/abs/1711.08527
====================================================
Direct and mediating influences of user-developer perception gaps in requirements understanding on user participation (Jingdong Jia - 21 November, 2017)
User participation is considered an effective way to conduct requirements engineering, but user-developer perception gaps in requirements understanding occur frequently. Survey data collected from 140 subjects were examined and analyzed using structural equation modeling
Link: https://arxiv.org/abs/1711.07880
====================================================
Programmatic Control of a Compiler for Generating High-performance Spatial Hardware (Hongbo Rong - 13 December, 2017)
Consequently, high performance is expected with substantially higher productivity: compared with high-performance programming in today's high-level synthesis (HLS) languages or hardware description languages (HDLs), the engineering effort on coding and verification is expected to be reduced from months to hours, a reduction of 2 or 3 orders of magnitude.
Link: https://arxiv.org/abs/1711.07606
====================================================
Implementing the Deep Q-Network (Melrose Roderick - 20 November, 2017)
However, replicating results for complex systems is often challenging since original scientific publications are not always able to describe in detail every important parameter setting and software engineering solution. Finally, we discuss methods for improving the computational performance and provide our own implementation that is designed to work with a range of domains, and not just the original Arcade Learning Environment [Bellemare et al., 2013].
Link: https://arxiv.org/abs/1711.07478
====================================================
Towards the Adoption of Anti-spoofing Protocols (Hang Hu - 6 February, 2018)
Second, to understand the reasons behind the low adoption rate, we collect 4293 discussion threads (25.7K messages) from the Internet Engineering Task Force (IETF), a working group formed to develop and promote Internet standards
Link: https://arxiv.org/abs/1711.06654
====================================================
Student Success Prediction in MOOCs (Josh Gardner - 10 April, 2018)
We critically survey work across each category, providing data on the raw data source, feature engineering, statistical model, evaluation method, prediction architecture, and other aspects of these experiments. Such a review is particularly useful given the rapid expansion of predictive modeling research in MOOCs since the emergence of major MOOC platforms in 2012
Link: https://arxiv.org/abs/1711.06349
====================================================
Can clone detection support quality assessments of requirements specifications? (Elmar Juergens - 15 November, 2017)
Due to their pivotal role in software engineering, considerable effort is spent on the quality assurance of software requirements specifications. This paper describes a large-scale case study that applied clone detection to 28 requirements specifications with a total of 8,667 pages
Link: https://arxiv.org/abs/1711.05472
====================================================
SkipFlow: Incorporating Neural Coherence Features for End-to-End Automatic Text Scoring (Yi Tay - 14 November, 2017)
Our approach demonstrates state-of-the-art performance on the benchmark ASAP dataset, outperforming not only feature engineering baselines but also other deep learning models.
Link: https://arxiv.org/abs/1711.04981
====================================================
Souper: A Synthesizing Superoptimizer (Raimondas Sasnauskas - 5 April, 2018)
If we can automatically derive compiler optimizations, we might be able to sidestep some of the substantial engineering challenges involved in creating and maintaining a high-quality compiler. Alternately, when Souper is used as a fully automated optimization pass it compiles a Clang compiler binary that is about 3 MB (4.4%) smaller than the one compiled by LLVM.
Link: https://arxiv.org/abs/1711.04422
====================================================
Convolutional Neural Network with Word Embeddings for Chinese Word Segmentation (Chunqi Wang - 12 November, 2017)
Without any feature engineering, the model obtains competitive performance -- 95.7% on PKU and 97.3% on MSR. Armed with word embeddings, the model achieves state-of-the-art performance on both datasets -- 96.5% on PKU and 98.0% on MSR, without using any external labeled resource.
Link: https://arxiv.org/abs/1711.04411
====================================================
Latent Dirichlet Allocation (LDA) and Topic modeling: models, applications, a survey (Hamed Jelodar - 12 November, 2017)
Researchers have published many articles in the field of topic modeling and applied it in various fields such as software engineering, political science, medical and linguistic science, etc. In this paper, we investigated scholarly articles (from 2003 to 2016) highly related to topic modeling based on LDA to discover the research development, current trends and intellectual structure of topic modeling
Link: https://arxiv.org/abs/1711.04305
====================================================
p-FP: Extraction, Classification, and Prediction of Website Fingerprints with Deep Learning (Se Eun Oh - 2 April, 2018)
Recent advances in learning Deep Neural Network (DNN) architectures have received a great deal of attention due to their ability to outperform state-of-the-art classifiers across a wide range of applications, with little or no feature engineering. Finally, we show that DNNs can be used to predict the fingerprintability of a website based on its contents, achieving 99% accuracy on a data set of 4500 website downloads.
Link: https://arxiv.org/abs/1711.03656
====================================================
Characterizing the structural diversity of complex networks across domains (Kansuke Ikehara - 30 October, 2017)
The structure of complex networks has been of interest in many scientific and engineering disciplines over the decades. In this paper, we study 986 real-world networks of diverse domains ranging from ecological food webs to online social networks along with 575 networks generated from four popular network models
Link: https://arxiv.org/abs/1710.11304
====================================================
A Supervised STDP-based Training Algorithm for Living Neural Networks (Yuan Zeng - 21 March, 2018)
A new supervised STDP-based learning algorithm is proposed in this work, which considers neuron engineering constraints. A 74.7% accuracy is achieved on the MNIST benchmark for handwritten digit recognition.
Link: https://arxiv.org/abs/1710.10944
====================================================
On modeling vagueness and uncertainty in data-to-text systems through fuzzy sets (A. Ramos-Soto - 27 October, 2017)
This paper intends to bridge this gap by answering the following questions: What does vagueness mean in fuzzy sets theory? What does vagueness mean in data-to-text contexts? In what ways can fuzzy sets theory contribute to improve data-to-text systems? What are the challenges that researchers from both disciplines need to address for a successful integration of fuzzy sets into data-to-text systems? In what cases should the use of fuzzy sets be avoided in D2T? For this, we review and discuss the state of the art of vagueness modeling in natural language generation and data-to-text, describe potential and actual usages of fuzzy sets in data-to-text contexts, and provide some additional insights about the engineering of data-to-text systems that make use of fuzzy set-based techniques.
Link: https://arxiv.org/abs/1710.10093
====================================================
Adapting Engineering Education to Industrie 4.0 Vision (Selim Coskun - 24 October, 2017)
An important part of the tasks in the preparation for Industrie 4.0 is the adaption of the higher education to the requirements of this vision, in particular the engineering education
Link: https://arxiv.org/abs/1710.08806
====================================================
REPETITA: Repeatable Experiments for Performance Evaluation of Traffic-Engineering Algorithms (Steven Gay - 24 October, 2017)
In this paper, we propose a pragmatic approach to improve reproducibility of experimental analyses of traffic engineering (TE) algorithms, whose implementation, evaluation and comparison are currently hard to replicate. In its current version, REPETITA includes (i) a dataset for repeatable experiments, consisting of more than 250 real network topologies with complete bandwidth and delay information as well as associated traffic matrices; and (ii) the implementation of state-of-the-art algorithms for intra-domain TE with IGP weight tweaking and Segment Routing optimization
Link: https://arxiv.org/abs/1710.08665
====================================================
On Neuromechanical Approaches for the Study of Biological Grasp and Manipulation (Francisco J Valero-Cuevas - 23 October, 2017)
engineering synthesis for the design and construction of robotic systems. The past 20 years have seen several conceptual advances in both fields and the quest to unify them
Link: https://arxiv.org/abs/1710.08557
====================================================
Solving the "false positives" problem in fraud prediction (Roy Wedge - 20 October, 2017)
In this paper, we present an automated feature engineering based approach to dramatically reduce false positives in fraud prediction. It is estimated that only 1 in 5 transactions declared as fraud are actually fraud and roughly 1 in every 6 customers have had a valid transaction declined in the past year. We generate 237 features (>100 behavioral patterns) for each transaction, and use a random forest to learn a classifier. On unseen data of 1.852 million transactions, we were able to reduce the false positives by 54% and provide a savings of 190K euros. We found that our solution can maintain similar benefits even when historical features are computed once every 7 days.
Link: https://arxiv.org/abs/1710.07709
====================================================
Communication-free Massively Distributed Graph Generation (Daniel Funke - 23 October, 2017)
However, engineering such algorithms is often hindered by the scarcity of publicly available datasets. Overall, we are able to generate instances of up to $2^{43}$ vertices and $2^{47}$ edges in less than 22 minutes on 32768 processors
Link: https://arxiv.org/abs/1710.07565
====================================================
SQG-Differential Evolution for difficult optimization problems under a tight function evaluation budget (Ramses Sala - 12 July, 2018)
Some of the most challenging simulation-based engineering design optimization problems are characterized by: a large number of design variables, the absence of analytical gradients, highly non-linear objectives and a limited function evaluation budget. The performance of the resulting derivative-free algorithm is compared with other state-of-the-art DE variants on 25 commonly used benchmark functions, under tight function evaluation budget constraints of 1000 evaluations
Link: https://arxiv.org/abs/1710.06770
====================================================
How PHP Releases Are Adopted in the Wild? (Jukka Ruohonen - 16 October, 2017)
Motivated by continuous software engineering practices and software traceability improvements for release engineering, the empirical analysis is based on big data collected by web crawling. To some extent, (iv) the results vary between the recent history from 2016 to early 2017 and the long-run evolution in the 2010s
Link: https://arxiv.org/abs/1710.05570
====================================================
Clickbait Detection in Tweets Using Self-attentive Network (Yiwei Zhou - 15 October, 2017)
The self-attentive neural network can be trained end-to-end, without involving any manual feature engineering. Our detector ranked first in the final evaluation of Clickbait Challenge 2017.
Link: https://arxiv.org/abs/1710.05364
====================================================
On Parallel Solution of Sparse Triangular Linear Systems in CUDA (Ruipeng Li - 13 October, 2017)
Solving linear systems with sparse triangular structured matrices is another important sparse kernel as demanded by a variety of scientific and engineering applications such as sparse linear solvers. Numerical results have indicated that the CUDA implementations of the proposed algorithms can outperform the state-of-the-art solvers in cuSPARSE by a factor of up to 2.6 for structured model problems and general sparse matrices.
Link: https://arxiv.org/abs/1710.04985
====================================================
End-to-end Network for Twitter Geolocation Prediction and Hashing (Jey Han Lau - 13 October, 2017)
Our model is language independent, and despite minimal feature engineering, it is interpretable and capable of learning location indicative words and timing patterns. Compared to state-of-the-art systems, our model outperforms them by 2%-6%
Link: https://arxiv.org/abs/1710.04802
====================================================
Supporting Requirements Engineering Research that Industry Needs: The Naming the Pain in Requirements Engineering Initiative (Daniel Méndez Fernández - 12 October, 2017)
In light of the 40th jubilee of Requirements Engineering (RE), roughly 40 experts met in Switzerland to discuss where our discipline stands today. However, it is also evident that after 40 years of promising research, conducting research that industry needs is still an ongoing challenge
Link: https://arxiv.org/abs/1710.04630
====================================================
Identifying Clickbait: A Multi-Strategy Approach Using Neural Networks (Vaibhav Kumar - 1 August, 2018)
We conduct experiments over a test corpus of 19538 social media posts, attaining an F1 score of 65.37% on the dataset, bettering the previous state-of-the-art as well as other proposed approaches, feature engineering or otherwise.
Link: https://arxiv.org/abs/1710.01507
====================================================
A Fully Convolutional Network for Semantic Labeling of 3D Point Clouds (Mohammed Yousefhussien - 3 October, 2017)
When classifying point clouds, a large amount of time is devoted to the process of engineering a reliable set of features which are then passed to a classifier of choice. Evaluated using the ISPRS 3D Semantic Labeling Contest, our method scored second place with an overall accuracy of 81.6%. We ranked third place with a mean F1-score of 63.32%, surpassing the F1-score of the method with highest accuracy by 1.69%
Link: https://arxiv.org/abs/1710.01408
====================================================
Indexing the Event Calculus with Kd-trees to Monitor Diabetes (Stefano Bromuri - 3 October, 2017)
In order to tackle the knowledge representation and efficiency problem, this contribution presents the kd-tree cached event calculus (CECKD), an event calculus extension for knowledge engineering of temporal rules capable of handling many thousands of events produced by a diabetic patient. CECKD is built as a support to a graphical interface to represent monitoring rules for diabetes type 1
Link: https://arxiv.org/abs/1710.01275
====================================================
A Simple and Efficient MapReduce Algorithm for Data Cube Materialization (Mukund Sundararajan - 28 September, 2017)
Nandi et al. (Transactions on Knowledge and Data Engineering, Vol. 6) first studied cube materialization for large scale datasets using the MapReduce framework, and proposed a sophisticated modification of a simple broadcast algorithm to handle a dataset with a 216GB cube size within 25 minutes with 2k machines in 2012. As a result, the algorithm shows excellent performance, and materialized a real dataset with a cube size of 35.0G tuples and 1.75T bytes in 54 minutes, with 0.4k machines in 2014.
Link: https://arxiv.org/abs/1709.10072
====================================================
Stream VByte: Faster Byte-Oriented Integer Compression (Daniel Lemire - 27 September, 2017)
They are appealing due to their simplicity and engineering convenience. On a 3.4GHz Haswell processor, it decodes more than 4 billion differentially-coded integers per second from RAM to L1 cache.
Link: https://arxiv.org/abs/1709.08990
====================================================
Report from GI-Dagstuhl Seminar 16394: Software Performance Engineering in the DevOps World (Andre van Hoorn - 26 September, 2017)
This report documents the program and the outcomes of GI-Dagstuhl Seminar 16394 "Software Performance Engineering in the DevOps World".
Link: https://arxiv.org/abs/1709.08951
====================================================
Deep Learning Based Cryptographic Primitive Classification (Gregory D. Hill - 25 September, 2017)
Reverse engineering potentially malicious software is a cumbersome task due to platform eccentricities and obfuscated transmutation mechanisms, hence requiring smarter, more efficient detection strategies. Converging at 91% accuracy, CryptoKnight is successfully able to classify the sample algorithms with minimal loss.
Link: https://arxiv.org/abs/1709.08385
====================================================
By Hook or by Crook: Exposing the Diverse Abuse Tactics of Technical Support Scammers (Bharat Srinivasan - 25 September, 2017)
Technical Support Scams (TSS), which combine online abuse with social engineering over the phone channel, have persisted despite several law enforcement actions. Our study period of 8 months uncovered over 9,000 TSS domains, of both passive and aggressive types, with minimal overlap between sets that are reached via organic search results and sponsored ads
Link: https://arxiv.org/abs/1709.08331
====================================================
Achieving CMMI Level 2 with Enhanced Extreme Programming Approach (Tuomo Kähkönen - 20 September, 2017)
The relationship between agile methods and Software Engineering Institute's CMM approach is often debated. The results provide empirical evidence that it is possible to achieve maturity level 2 with an approach based on XP
Link: https://arxiv.org/abs/1709.06822
====================================================
Model-driven Engineering IDE for Quality Assessment of Data-intensive Applications (Marc Gil - 19 July, 2017)
This article introduces a model-driven engineering (MDE) integrated development environment (IDE) for Data-Intensive Cloud Applications (DIA) with iterative quality enhancements. As part of the H2020 DICE project (ICT-9-2014, id 644869), a framework is being constructed and it is composed of a set of tools developed to support a new MDE methodology
Link: https://arxiv.org/abs/1709.06516
====================================================
Investigating Storage as a Service Cloud Platform: pCloud as a Case Study (Tooska Dargahi - 13 September, 2017)
There are many ways for criminals to compromise cloud services, ranging from non-technical attack methods, such as social engineering, to deploying advanced malware. We carried out our experiments on four different virtual machines running four popular operating systems: 64-bit Windows 8, Ubuntu 14.04.1 LTS, Android 4.4.2, and iOS 8.1
Link: https://arxiv.org/abs/1709.04417
====================================================
Should I Stay or Should I Go? On Forces that Drive and Prevent MBSE Adoption in the Embedded Systems Industry (Andreas Vogelsang - 1 September, 2017)
[Context] Model-based Systems Engineering (MBSE) comprises a set of models and techniques that is often suggested as a solution to cope with the challenges of engineering complex systems. [Method] Our results are based on 20 interviews with experts from 10 companies
Link: https://arxiv.org/abs/1709.00266
====================================================
An Empirical Study of Discriminative Sequence Labeling Models for Vietnamese Text Processing (Phuong Le-Hong - 30 August, 2017)
We show that a strong lower bound for labeling accuracy, with performance scores of 90.65% and 86.03% on the standard test sets for the two tasks respectively, can be obtained by relying only on simple word-based features with minimal hand-crafted feature engineering
Link: https://arxiv.org/abs/1708.09163
====================================================
A Scalable and Extensible Checkpointing Scheme for Massively Parallel Simulations (Nils Kohl - 29 January, 2018)
Realistic simulations in engineering or in the materials sciences can consume enormous computing resources and thus require the use of massively parallel supercomputers. We demonstrate the efficiency and scalability of the checkpoint strategy for simulations with up to 40 billion computational cells executing on more than 400 billion floating point values. A checkpoint creation is shown to require only a few seconds and the new checkpointing scheme scales almost perfectly up to more than 260,000 ($2^{18}$) processes
Link: https://arxiv.org/abs/1708.08286
====================================================
Automated Website Fingerprinting through Deep Learning (Vera Rimmer - 5 December, 2017)
In this paper, we show that an adversary can automate the feature engineering process, and thus automatically deanonymize Tor traffic by applying our novel method based on deep learning. The obtained success rate exceeds 96% for a closed world of 100 websites and 94% for our biggest closed world of 900 classes. In our open world evaluation, the most performant deep learning model is 2% more accurate than the state-of-the-art attack
Link: https://arxiv.org/abs/1708.06376
====================================================
Evaluation of Human-Robot Collaboration Models for Fluent Operations in Industrial Tasks (Lior Sayfeld - 16 August, 2017)
Eighty industrial engineering students aged 22-27 participated in experiments in which timing and sensor based models were compared to an adaptive model developed within this framework. The results showed conclusively that the adaptive system improved the examined parameters and provided an improvement of 7% in total assembly time and 60% in total idle time when compared to timing and sensory based models.
Link: https://arxiv.org/abs/1708.04790
====================================================
Controlled Experiments with Student Participants in Software Engineering: Preliminary Results from a Systematic Mapping Study (Marian Daun - 15 August, 2017)
[Methods] This paper reports on a systematic mapping study using high-quality journals and conferences from the software engineering field as data sources. We scanned all papers published between 2010 and 2014 and investigated all papers reporting student experiments in detail. [Results] From 2788 papers under investigation 175 report results from controlled experiments. 109 (62.29%) of these controlled experiments have been conducted with student participants
Link: https://arxiv.org/abs/1708.04662
====================================================
Preconditioning immersed isogeometric finite element methods with application to flow problems (Frits de Prenter - 11 August, 2017)
[Computer Methods in Applied Mechanics and Engineering, 316 (2017) pp
Link: https://arxiv.org/abs/1708.03519
====================================================
Argument Labeling of Explicit Discourse Relations using LSTM Neural Networks (Sohail Hooda - 7 September, 2017)
Using the PDTB dataset, our best model achieved an F1 measure of 23.05% without any feature engineering. This is significantly higher than the 20.52% achieved by the state of the art RNN approach, but significantly lower than the feature based state of the art systems
Link: https://arxiv.org/abs/1708.03425
====================================================
Gradient-enhanced kriging for high-dimensional problems (Mohamed Amine Bouhlel - 8 August, 2017)
To validate our method, we compare the global accuracy of the proposed method with conventional kriging surrogate models on two analytic functions with up to 100 dimensions, as well as engineering problems of varied complexity with up to 15 dimensions. In some cases, we get over 3 times more accurate models than a bench of surrogate models from the literature, and also over 3200 times faster than standard gradient-enhanced kriging models.
Link: https://arxiv.org/abs/1708.02663
====================================================
Learning Transferable Architectures for Scalable Image Recognition (Barret Zoph - 11 April, 2018)
Developing neural network image classification models often requires significant architecture engineering. On CIFAR-10 itself, NASNet achieves 2.4% error rate, which is state-of-the-art. On ImageNet, NASNet achieves, among the published works, state-of-the-art accuracy of 82.7% top-1 and 96.2% top-5. Our model is 1.2% better in top-1 accuracy than the best human-invented architectures while having 9 billion fewer FLOPS - a reduction of 28% in computational demand from the previous state-of-the-art model. For instance, a small version of NASNet also achieves 74% top-1 accuracy, which is 3.1% better than equivalently-sized, state-of-the-art models for mobile platforms. Finally, the features learned by NASNet used with the Faster-RCNN framework surpass state-of-the-art by 4.0% achieving 43.1% mAP on the COCO dataset.
Link: https://arxiv.org/abs/1707.07012
====================================================
Learn More, Pay Less! Lessons Learned from Applying the Wizard-of-Oz Technique for Exploring Mobile App Requirements (Zahra Shakeri Hossein Abad - 17 July, 2017)
To compare the role of early interactive requirements specification and app reviews, we conducted two studies: (i) a case study analysis of 13 mobile app development teams who used very early-stage Requirements Engineering (RE) by applying WOz, and (ii) a study analyzing 40 similar mobile apps (70,592 reviews) on Google Play
Link: https://arxiv.org/abs/1707.05272
====================================================
Cognitive Biases in Software Engineering: A Systematic Mapping Study (Rahul Mohanani - 20 June, 2018)
This paper therefore systematically maps, aggregates and synthesizes the literature on cognitive biases in software engineering to generate a comprehensive body of knowledge, understand state of the art research and provide guidelines for future research and practise. Focusing on bias antecedents, effects and mitigation techniques, we identified 65 articles, which investigate 37 cognitive biases, published between 1990 and 2016
Link: https://arxiv.org/abs/1707.03869
====================================================
What Works Better? A Study of Classifying Requirements (Zahra Shakeri Hossein Abad - 7 July, 2017)
Classifying requirements into functional requirements (FR) and non-functional ones (NFR) is an important task in requirements engineering. Our study is performed on 625 requirements provided by the OpenScience tera-PROMISE repository
Link: https://arxiv.org/abs/1707.02358
====================================================
Applying the Polyhedral Model to Tile Time Loops in Devito (Dylan McCormick - 30 June, 2017)
Some of the most effective of these optimizations are not suitable for development by hand or require advanced software engineering knowledge which is beyond the level of many researchers who are not specialists in code optimization. We present a loop-tiling optimization which can be applied to Devito-generated loops and improves run time by up to 27.5%, and options for automating this optimization in the Devito framework.
Link: https://arxiv.org/abs/1707.02347
====================================================
A Generalised Seizure Prediction with Convolutional Neural Networks for Intracranial and Scalp Electroencephalogram Data Analysis (Nhan Duy Truong - 6 December, 2017)
However, many works rely on heavily handcrafted feature extraction and/or feature engineering carefully tailored to each patient to achieve very high sensitivity and low false prediction rate for a particular dataset. We use the Short-Time Fourier Transform (STFT) on 30-second EEG windows with 50% overlap to extract information in both frequency and time domains. The proposed approach achieves sensitivities of 81.4%, 81.2%, and 82.3% and false prediction rates (FPR) of 0.06/h, 0.16/h, and 0.22/h on the Freiburg Hospital intracranial EEG (iEEG) dataset, the Children's Hospital of Boston-MIT scalp EEG (sEEG) dataset, and the Kaggle American Epilepsy Society Seizure Prediction Challenge's dataset, respectively
Link: https://arxiv.org/abs/1707.01976
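The windowing step mentioned in the excerpt above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: it assumes the 50% overlap applies to consecutive 30-second windows, and the sampling rate (256 Hz), inner frame length, hop size, Hann taper, and function names are all hypothetical choices.

```python
import numpy as np


def sliding_windows(signal, window_len, overlap=0.5):
    """Split a 1-D signal into fixed-length windows with fractional overlap."""
    if len(signal) < window_len:
        raise ValueError("signal is shorter than one window")
    step = int(window_len * (1.0 - overlap))
    count = 1 + (len(signal) - window_len) // step
    return np.stack([signal[i * step : i * step + window_len] for i in range(count)])


def stft_magnitude(window, frame_len, hop):
    """Magnitude STFT of one window, using Hann-tapered frames (assumed taper)."""
    taper = np.hanning(frame_len)
    starts = range(0, len(window) - frame_len + 1, hop)
    frames = np.stack([window[s : s + frame_len] * taper for s in starts])
    # Rows are frequency bins, columns are time frames.
    return np.abs(np.fft.rfft(frames, axis=1)).T


# Usage with a synthetic 10 Hz tone standing in for one EEG channel.
fs = 256  # hypothetical sampling rate in Hz
t = np.arange(120 * fs) / fs
eeg = np.sin(2 * np.pi * 10 * t)
windows = sliding_windows(eeg, window_len=30 * fs, overlap=0.5)
spectrogram = stft_magnitude(windows[0], frame_len=fs, hop=fs // 2)
```

Each 30-second window becomes a two-dimensional time-frequency array, which is the kind of input a convolutional network can consume.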
====================================================
A Visual Narrative Path from Switching to Resuming a Requirements Engineering Task (Zahra Shakeri Hossein Abad - 6 July, 2017)
Requirements Engineering (RE) is closely tied to other development activities and is at the heart and foundation of every software development process. Moreover, we surveyed 53 software developers to test our visual prototype and to explore more required features for the visual and analytical layers of our framework.
Link: https://arxiv.org/abs/1707.01921
====================================================
Task Interruptions in Requirements Engineering: Reality versus Perceptions! (Zahra Shakeri Hossein Abad - 3 July, 2017)
Task switching and interruptions are a daily reality in software development projects: developers switch between Requirements Engineering (RE), coding, testing, daily meetings, and other tasks. In this paper, to compare the reality of task switching in RE with the perception of developers, we conducted two studies: (i) a case study analysis on 5,076 recorded tasks of 19 developers and (ii) a survey of 25 developers
Link: https://arxiv.org/abs/1707.00794
====================================================
A new class of permutation trinomials constructed from Niho exponents (Tao Bai - 3 October, 2017)
Permutation polynomials over finite fields are an interesting subject due to their important applications in the areas of mathematics and engineering. It is shown that when $p=3$ or $5$, $f(x)$ is a permutation trinomial of $\mathbb{F}_{q^2}$ if and only if $k$ is even
Link: https://arxiv.org/abs/1707.00549
====================================================
On Evidence-based Risk Management in Requirements Engineering (Daniel Méndez Fernández - 31 July, 2017)
Background: The sensitivity of Requirements Engineering (RE) to the context makes it difficult to efficiently control problems therein, thus, hampering an effective risk management devoted to allow for early corrective or even preventive measures. Research Method: We use survey data from 228 companies and build a probabilistic network that supports the forecast of context-specific RE phenomena. Results: Our results from an initial validation in 6 companies strengthen our confidence that the approach increases the awareness for individual risk factors in RE, and the feedback further allows for disseminating our approach into practice.
Link: https://arxiv.org/abs/1707.00144
====================================================
Speaking Style Authentication Using Suprasegmental Hidden Markov Models (Ismail Shahin - 29 June, 2017)
The importance of speaking style authentication from human speech is gaining increasing attention and concern from the engineering community. Based on SPHMMs, our results show that the average speaking style authentication performance is 99%, 37%, 85%, 60%, 61%, 59%, 41%, 61%, and 57% for the speaking styles neutral, shouted, slow, loud, soft, fast, angry, happy, and fearful, respectively.
Link: https://arxiv.org/abs/1706.09736
====================================================
Recurrent neural networks with specialized word embeddings for health-domain named-entity recognition (Inigo Jauregi Unanue - 24 June, 2018)
Previous state-of-the-art systems on Drug Name Recognition (DNR) and Clinical Concept Extraction (CCE) have focused on a combination of text "feature engineering" and conventional machine learning algorithms such as conditional random fields and support vector machines. The specialized embeddings have helped to cover unusual words in DDI-DrugBank and DDI-MedLine, but not in the 2010 i2b2/VA IRB Revision dataset
Link: https://arxiv.org/abs/1706.09569
====================================================
Sketches and Diagrams in Practice (Sebastian Baltes - 28 June, 2017)
In this paper, we investigate the use of sketches and diagrams in software engineering practice. We present the results of an exploratory study in three companies and an online survey with 394 participants
Link: https://arxiv.org/abs/1706.09172
====================================================
Risk-Informed Interference Assessment for Shared Spectrum Bands: A Wi-Fi/LTE Coexistence Case Study (Andra M. Voicu - 8 June, 2017)
In this paper we demonstrate the benefit of risk-informed interference assessment to aid spectrum regulators in making decisions, and to readily convey engineering insight. Wi-Fi/LTE in the 5 GHz unlicensed band, and we demonstrate that this method comprehensively quantifies the interference impact. Our results show that no regulatory intervention is needed to ensure harmonious technical Wi-Fi/LTE coexistence: for the typically large number of channels available in the 5 GHz band, the risk for Wi-Fi from LTE is negligible, rendering policy and engineering concerns largely moot. For LTE intra-technology inter-operator coexistence, both variants typically coexist well in the 5 GHz band, but for dense deployments, implementing listen-before-talk causes less interference.
Link: https://arxiv.org/abs/1706.02479
====================================================
Can Pairwise Testing Perform Comparably to Manually Handcrafted Testing Carried Out by Industrial Engineers? (Peter Charbachi - 6 June, 2017)
Testing is an important activity in engineering of industrial software. In this study we compare pairwise test suites with test suites created manually by engineers for 45 industrial programs. The results also suggest that pairwise testing is just as good as manual testing at fault detection for 64% of the programs.
Link: https://arxiv.org/abs/1706.01636
====================================================
Tailoring Architecture Centric Design Method with Rapid Prototyping (Nitish M. Devadiga - 6 June, 2017)
Two features commonly lacking from many engineering processes are: 1) the formal capacity to rapidly develop prototypes in the rudimentary stage of the project, and 2) transitioning of requirements into architectural designs, when and how to evaluate designs and how to use the throwaway prototypes throughout the system lifecycle
Link: https://arxiv.org/abs/1706.01602
====================================================
One button machine for automating feature engineering in relational databases (Hoang Thanh Lam - 1 June, 2017)
Feature engineering is one of the most important and time-consuming tasks in predictive analytics projects. We validated OneBM in three Kaggle competitions, in which it achieved performance as good as the top 16% to 24% of data scientists
Link: https://arxiv.org/abs/1706.00327
====================================================
A Snowballing Literature Study on Test Amplification (Benjamin Danglot - 26 July, 2018)
This article surveys various works that aim at exploiting this knowledge in order to enhance these manually written tests with respect to an engineering goal (e.g., improve coverage of changes or increase the accuracy of fault localization). We reviewed the 70 papers in this set and selected the 4 papers that fit our definition of test amplification. We use these 4 papers as the seed for our snowballing study, and systematically followed the citation graph
Link: https://arxiv.org/abs/1705.10692
====================================================
A Deep Multi-View Learning Framework for City Event Extraction from Twitter Data Streams (Nazli Farajidavar - 28 May, 2017)
Our goal has been to build a flexible architecture that can learn representations useful for tasks, thus avoiding excessive task-specific feature engineering. The results of our evaluations show that our proposed solution outperforms the existing models and can be used for extracting city related events with an averaged accuracy of 81% over all classes. The analysis showed that 49.5% of the Twitter traffic comments are reported approximately five hours prior to the authorities' official records. Moreover, we discovered that amongst the scheduled sociocultural event topics, tweets reporting transportation, cultural and social events are 31.75% more likely to influence the distribution of the Twitter comments than sport, weather and crime topics.
Link: https://arxiv.org/abs/1705.09975
====================================================
An Empirical Analysis of Approximation Algorithms for the Euclidean Traveling Salesman Problem (Yihui He - 25 May, 2017)
The traveling salesman problem (TSP) is a classical computer science optimization problem with applications to industrial engineering, theoretical computer science, bioinformatics, and several other disciplines. We use several datasets as input for the algorithms including a small dataset, a medium-sized dataset representing cities in the United States, and a synthetic dataset consisting of 200 cities to test algorithm scalability
Link: https://arxiv.org/abs/1705.09058
====================================================
Spelling Correction as a Foreign Language (Yingbo Zhou - 20 May, 2017)
The model offers competitive performance as compared to the state of the art methods but does not require any feature engineering nor hand tuning between models.
Link: https://arxiv.org/abs/1705.07371
====================================================
How do Practitioners Perceive the Relevance of Requirements Engineering Research? An Ongoing Study (X. Franch - 14 June, 2017)
The relevance of Requirements Engineering (RE) research to practitioners is a prerequisite for problem-driven research in the area and key for a long-term dissemination of research results to everyday practice. To this end, we have designed a survey to be sent to several hundred industry practitioners at various companies around the world and ask them to rate their perceived practical relevance of the research described in a sample of 418 RE papers published between 2010 and 2015 at the RE, ICSE, FSE, ESEC/FSE, ESEM and REFSQ conferences
Link: https://arxiv.org/abs/1705.06013
====================================================
A Novel Neural Network Model for Joint POS Tagging and Graph-based Dependency Parsing (Dat Quoc Nguyen - 8 June, 2017)
Our model uses bidirectional LSTMs to learn feature representations shared for both POS tagging and dependency parsing tasks, thus handling the feature-engineering problem. Our extensive experiments, on 19 languages from the Universal Dependencies project, show that our model outperforms the state-of-the-art neural network-based Stack-propagation model for joint POS tagging and transition-based dependency parsing, resulting in a new state of the art
Link: https://arxiv.org/abs/1705.05952
====================================================
Design Criteria to Architect Continuous Experimentation for Self-Driving Vehicles (Federico Giaimo - 12 June, 2017)
The software powering today's vehicles surpasses mechatronics as the dominating engineering challenge due to its fast evolving and innovative nature. In addition, the software and system architecture for upcoming vehicles with automated driving functionality is already processing ~750MB/s - corresponding to over 180 simultaneous 4K-video streams from popular video-on-demand services
Link: https://arxiv.org/abs/1705.05170
====================================================
FLASH: A Faster Optimizer for SBSE Tasks (Vivek Nair - 18 May, 2017)
Most problems in search-based software engineering involve balancing conflicting objectives. FLASH was found to be the fastest optimizer (sometimes requiring less than 1% of the evaluations used by evolutionary algorithms)
Link: https://arxiv.org/abs/1705.05018
====================================================
Drug-drug Interaction Extraction via Recurrent Neural Network with Multiple Attention Layers (Zibo Yi - 18 May, 2017)
However, existing work utilizes either complex feature engineering or NLP tools, both of which are insufficient for sentence comprehension. We evaluate our model on the 2013 SemEval DDIExtraction dataset
Link: https://arxiv.org/abs/1705.03261
====================================================
Identifying combinations of tetrahedra into hexahedra: a vertex based strategy (Jeanne Pellerin - 4 January, 2018)
All identified cells are valid for engineering analysis. Around 3 million potential hexahedra are computed in 10 seconds on a laptop
Link: https://arxiv.org/abs/1705.02451
====================================================
Parameter reduction in nonlinear state-space identification of hysteresis (Alireza Fakhrizadeh Esfahani - 29 April, 2017)
Hysteresis is a highly nonlinear phenomenon, showing up in a wide variety of science and engineering problems. We have found that the presented decoupling approach is able to reduce the number of parameters of the full nonlinear model by up to about 50%, while maintaining a comparable output error level.
Link: https://arxiv.org/abs/1705.00178
====================================================
A Methodology of Guiding Web Content Mining and Knowledge Discovery in Evidence-based Software Engineering (Zheng Li - 25 April, 2017)
Systematic Literature Review (SLR) is a rigorous methodology applied for Evidence-Based Software Engineering (EBSE) that identifies, assesses and synthesizes the relevant evidence for answering specific research questions. Benefiting from the booming online materials in the era of Web 2.0, technical Web content is starting to act as an alternative source for EBSE
Link: https://arxiv.org/abs/1704.07551
====================================================
FEUP at SemEval-2017 Task 5: Predicting Sentiment Polarity and Intensity with Financial Word Embeddings (Pedro Saleiro - 17 April, 2017)
This paper presents the approach developed at the Faculty of Engineering of University of Porto, to participate in SemEval 2017, Task 5: Fine-grained Sentiment Analysis on Financial Microblogs and News. We used an external collection of tweets and news headlines mentioning companies/stocks from S&P 500 to create financial word embeddings which are able to capture domain-specific syntactic and semantic similarities. The resulting approach obtained a cosine similarity score of 0.69 in sub-task 5.1 - Microblogs and 0.68 in sub-task 5.2 - News Headlines.
Link: https://arxiv.org/abs/1704.05091
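The scores reported in the excerpt above are cosine similarities between predicted and gold sentiment values. A minimal sketch of that measure follows, treating the two score lists as vectors. The function name and the zero-vector convention are assumptions, and any additional weighting applied by the task's official scorer is not reproduced here.

```python
import math


def cosine_similarity(predicted, gold):
    """Cosine of the angle between two equal-length score vectors."""
    if len(predicted) != len(gold):
        raise ValueError("score vectors must have the same length")
    dot = sum(p * g for p, g in zip(predicted, gold))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_g = math.sqrt(sum(g * g for g in gold))
    if norm_p == 0.0 or norm_g == 0.0:
        return 0.0  # assumed convention when one vector is all zeros
    return dot / (norm_p * norm_g)
```

A value of 1 means the predicted scores point in exactly the same direction as the gold scores, 0 means they are orthogonal, and -1 means they are opposite.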
====================================================
Scalable Rate Control for Traffic Engineering with Aggregated Flows in Software Defined Networks (Jian-Jhih Kuo - 15 August, 2017)
To increase the scalability of Software Defined Networks (SDNs), flow aggregation schemes have been proposed to merge multiple mouse flows into an elephant aggregated flow for traffic engineering. Simulation results based on real networks show that JFSRD performs nearly optimally in small-scale networks, and the number of controlled flows can be effectively reduced by 50% in real networks.
Link: https://arxiv.org/abs/1704.04182
====================================================
DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks (Valentin Flunkert - 5 July, 2017)
We show through extensive empirical evaluation on several real-world forecasting data sets that our methodology is more accurate than state-of-the-art models, while requiring minimal feature engineering.
Link: https://arxiv.org/abs/1704.04110
====================================================
How Professional Hackers Understand Protected Code while Performing Attack Tasks (Mariano Ceccato - 26 May, 2017)
Code protections aim at blocking (or at least delaying) reverse engineering and tampering attacks on critical assets within programs. Our qualitative analysis of the reports consists of open coding, carried out by 7 annotators and resulting in 459 annotations, followed by concept extraction and model inference
Link: https://arxiv.org/abs/1704.02774
====================================================
The Many Faces of Link Fraud (Neil Shah - 11 September, 2017)
We discuss how to leverage our insights in practice by engineering strongly performing entropy-based features and demonstrating high classification accuracy. Our contributions are (a) instrumentation: we detail our experimental setup and carefully engineered data collection process to scrape Twitter data while respecting API rate-limits, (b) observations on fraud multimodality: we analyze our honeypot fraudster ecosystem and give surprising insights into the multifaceted behaviors of these fraudster types, and (c) features: we propose novel features that give strong (>0.95 precision/recall) discriminative power on ground-truth Twitter data.
Link: https://arxiv.org/abs/1704.01420
====================================================
Conical: an extended module for computing a numerically satisfactory pair of solutions of the differential equation for conical functions (T. M. Dunster - 4 April, 2017)
Conical functions appear in a large number of applications in physics and engineering. Specifically, the module now includes a routine for computing the function ${\rm R}^{m}_{-\frac{1}{2}+i\tau}(x)$, a real-valued numerically satisfactory companion of the function ${\rm P}^m_{-\tfrac12+i\tau}(x)$ for $x>1$
Link: https://arxiv.org/abs/1704.01145
====================================================
Survey Research in Software Engineering: Problems and Strategies (Ahmad Nauman Ghazi - 4 April, 2017)
The researchers are all focused on empirical software engineering. Results: We identified 24 problems and 65 strategies, structured according to the survey research process
Link: https://arxiv.org/abs/1704.01090
====================================================
Review on Requirements Modeling and Analysis for Self-Adaptive Systems: A Ten-Year Perspective (Zhuoqun Yang - 3 April, 2017)
Context: Over the last decade, software researchers and engineers have developed a vast body of methodologies and technologies in requirements engineering for self-adaptive systems. To ensure the quality of the study, we choose 21 highly regarded publication venues and 8 popular digital libraries. Results: We selected 109 papers during the period of 2003-2013 and presented the research distributions over various kinds of factors. We extracted 29 modeling methods which are classified into 8 categories and identified 14 requirements activities which are classified into 4 requirements timelines. We captured 8 concerned software quality attributes based on the ISO 9126 standard and 12 application domains
Link: https://arxiv.org/abs/1704.00421
====================================================
Syntax Aware LSTM Model for Chinese Semantic Role Labeling (Feng Qian - 19 April, 2017)
The structure of SA-LSTM is modified according to dependency parsing information, so that parsing information is modeled directly through architecture engineering rather than feature engineering. Furthermore, SA-LSTM significantly outperforms the state of the art on CPB 1.0 according to Student's t-test ($p<0.05$).
Link: https://arxiv.org/abs/1704.00405
====================================================
What Is the Best Way For Developers to Learn New Software Tools? An Empirical Comparison Between a Text and a Video Tutorial (Verena Käfer - 31 March, 2017)
We then conducted an experiment in three groups where 42 undergraduate students from a software engineering course were commissioned to operate the software tool after using a tutorial: the first group was provided only with the video tutorial, the second group only with the text tutorial and the third group with both. The data is available at [12]
Link: https://arxiv.org/abs/1704.00074
====================================================
Learning Visual Servoing with Deep Features and Fitted Q-Iteration (Alex X. Lee - 10 July, 2017)
Standard visual servoing approaches typically rely on manually designed features and analytical dynamics models, which limits their generalization capability and often requires extensive application-specific feature and model engineering. We show that we can learn an effective visual servo on a complex synthetic car following benchmark using just 20 training trajectory samples for reinforcement learning
Link: https://arxiv.org/abs/1703.11000
====================================================
DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling (Lachlan Tychsen-Smith - 20 July, 2017)
This methodology extends and formalizes previous state-of-the-art detection models with an additional emphasis on high evaluation rates and reduced manual engineering. The resulting model is scene adaptive, does not require manually defined reference bounding boxes and produces highly competitive results on MSCOCO, Pascal VOC 2007 and Pascal VOC 2012 with real-time evaluation rates
Link: https://arxiv.org/abs/1703.10295
====================================================
Bootstrapping a Lexicon for Emotional Arousal in Software Engineering (Mika V. Mäntylä - 27 March, 2017)
We present the first version of a Software Engineering Arousal lexicon (SEA) that is specifically designed to address the problem of emotional arousal in the software developer ecosystem. The best performance is obtained by combining SEA (428 words) with a previously created general purpose lexicon by Warriner et al. (13,915 words) and it achieves Cohen's d effect sizes up to 0.5.
Link: https://arxiv.org/abs/1703.09046
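The Cohen's d effect size quoted above can be illustrated with a minimal sketch. It assumes the common pooled-standard-deviation form of d for two independent groups; the function name and the sample numbers are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical arousal scores for two sets of texts.
print(cohens_d([2, 4, 6], [1, 3, 5]))
```

A d of 0.5 is conventionally read as a medium effect, which is the upper end of what the excerpt reports.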
====================================================
Extracting Build Changes with BUILDDIFF (Christian Macho - 24 March, 2017)
Build systems are an essential part of modern software engineering projects. In this paper, we present BUILDDIFF, an approach to extract detailed build changes from MAVEN build files and classify them into 95 change types. In a manual evaluation of 400 build changing commits, we show that BUILDDIFF can extract and classify build changes with an average precision and recall of 0.96 and 0.98, respectively. We then present two studies using the build changes extracted from 30 open source Java projects to study the frequency and time of build changes. The results show that the top 10 most frequent change types account for 73% of the build changes
Link: https://arxiv.org/abs/1703.08527
====================================================
Requirements Engineering Practice and Problems in Agile Projects: Results from an International Survey (Stefan Wagner - 24 March, 2017)
As part of a bigger survey initiative (Naming the Pain in Requirements Engineering), we build an empirical basis on such aspects of agile RE. Based on the responses of representatives from 92 different organisations, we found that agile RE concentrates on free-text documentation of requirements elicited with a variety of techniques
Link: https://arxiv.org/abs/1703.08360
====================================================
Supervised Typing of Big Graphs using Semantic Embeddings (Mayank Kejriwal - 22 March, 2017)
It does not require any manual feature engineering, generalizes well to hundreds of types and achieves near-linear scaling on Big Graphs containing many millions of triples and instances by virtue of an incremental execution. We demonstrate the utility of the embeddings on a type recommendation task, outperforming a non-parametric feature-agnostic baseline while achieving 15x speedup and near-constant memory usage on a full partition of DBpedia. Finally, we use the embeddings to probabilistically cluster about 4 million DBpedia instances into 415 types in the DBpedia ontology.
Link: https://arxiv.org/abs/1703.07805
====================================================
Proceedings International Workshop on Formal Engineering approaches to Software Components and Architectures (Jan Kofroň - 20 March, 2017)
These are the proceedings of the 14th International Workshop on Formal Engineering approaches to Software Components and Architectures (FESCA). The workshop was held on April 22, 2017 in Uppsala (Sweden) as a satellite event to the European Joint Conference on Theory and Practice of Software (ETAPS'17).
Link: https://arxiv.org/abs/1703.06590
====================================================
Systematic Mapping Study of Template-based Code Generation (Eugene Syriani - 18 March, 2017)
Template-based code generation (TBCG) is a popular technique in model-driven engineering (MDE), given that both emphasize abstraction and automation. Our study shows that the community has used TBCG in diverse ways over the past 15 years
Link: https://arxiv.org/abs/1703.06353
====================================================
Machine learning approach for early detection of autism by combining questionnaire and home video screening (Halim Abbas - 15 March, 2017)
To overcome the scarcity, sparsity, and imbalance of training data, we apply creative feature selection, feature engineering, and novel feature encoding techniques. We demonstrate a significant accuracy improvement over standard screening tools in a clinical study sample of 162 children.
Link: https://arxiv.org/abs/1703.06076
====================================================
On the Unhappiness of Software Developers (Daniel Graziotin - 10 May, 2017)
Recent research in software engineering supports the thesis, and the ideal of flourishing happiness among software developers is often expressed among industry practitioners. We conducted a large-scale quantitative and qualitative survey, incorporating a psychometrically validated instrument for measuring (un)happiness, with 2220 developers, yielding a rich and balanced sample of 1318 complete responses. We also identified 219 factors representing causes of unhappiness while developing software
Link: https://arxiv.org/abs/1703.04993
====================================================
Evaluation of 50 Greek Science and Engineering University Departments using Google Scholar (Marina Pitsolanti - 27 July, 2017)
In this paper, the scientometric evaluation of faculty members of 50 Greek Science and Engineering University Departments is presented. 1978 academics were examined in total
Link: https://arxiv.org/abs/1703.04478
====================================================
Building automated vandalism detection tools for Wikidata (Amir Sarabadani - 10 March, 2017)
This work is novel in that identifying damaging changes in a structured knowledge-base requires substantially different feature engineering work than in a text-based wiki like Wikipedia. We describe a machine classification strategy that is able to catch 89% of vandalism while reducing patrollers' workload by 98%, by drawing lightly from contextual features of an edit and heavily from the characteristics of the user making the edit.
Link: https://arxiv.org/abs/1703.03861
====================================================
On the Presence of Green and Sustainable Software Engineering in Higher Education Curricula (Damiano Torre - 3 March, 2017)
To this end, we report the findings from a targeted survey of 33 academics on the presence of green and sustainable software engineering in higher education
Link: https://arxiv.org/abs/1703.01078
====================================================
eXpose: A Character-Level Convolutional Neural Network with Embeddings For Detecting Malicious URLs, File Paths and Registry Keys (Joshua Saxe - 27 February, 2017)
Unfortunately, this vision hasn't come to fruition: in fact, developing and maintaining today's security machine learning systems can require engineering resources that are comparable to those of signature-based detection systems, due in part to the need to develop and continuously tune the "features" these machine learning systems look at as attacks evolve. In addition to completely automating the feature design and extraction process, eXpose outperforms manual feature extraction based baselines on all of the intrusion detection problems we tested it on, yielding a 5%-10% detection rate gain at 0.1% false positive rate compared to these baselines.
Link: https://arxiv.org/abs/1702.08568
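A "detection rate at a fixed false positive rate" figure, such as the one quoted above, can be computed from classifier scores roughly as follows. This is a minimal sketch under stated assumptions: the function name, the scores, and the labels are hypothetical, and the paper's own evaluation code is not shown in the excerpt.

```python
def detection_rate_at_fpr(scores, labels, max_fpr):
    """Best true positive rate over thresholds whose false positive rate <= max_fpr.

    labels: 1 for malicious, 0 for benign. A sample is flagged when its
    score is at or above the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one malicious and one benign sample")
    best_tpr = 0.0
    for threshold in sorted(set(scores)):
        flagged = [y for s, y in zip(scores, labels) if s >= threshold]
        true_pos = sum(flagged)
        false_pos = len(flagged) - true_pos
        if false_pos / negatives <= max_fpr:
            best_tpr = max(best_tpr, true_pos / positives)
    return best_tpr


# Hypothetical scores: higher means "more likely malicious".
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
print(detection_rate_at_fpr(scores, labels, max_fpr=0.0))
```

Comparing two models by this number at the same `max_fpr` is what a "detection rate gain at 0.1% false positive rate" expresses.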
====================================================
Deep Voice: Real-time Neural Text-to-Speech (Sercan O. Arik - 7 March, 2017)
By using a neural network for each component, our system is simpler and more flexible than traditional text-to-speech systems, where each component requires laborious feature engineering and extensive domain expertise. Finally, we show that inference with our system can be performed faster than real time and describe optimized WaveNet inference kernels on both CPU and GPU that achieve up to 400x speedups over existing implementations.
Link: https://arxiv.org/abs/1702.07825
====================================================
Flipping a Graduate-Level Software Engineering Foundations Course (Hakan Erdogmus - 22 February, 2017)
Creating a graduate-level software engineering breadth course is challenging. The course has been offered since Fall 2014 in the Silicon Valley campus
Link: https://arxiv.org/abs/1702.07069
====================================================
Shannon-inspired Statistical Computing to Enable Spintronics (Ameya D. Patil - 19 February, 2017)
This extraordinary result allowing a $10^{13}$ fold relaxation in acceptable error rates is obtained by engineering the error distribution coupled with statistical error compensation.
Link: https://arxiv.org/abs/1702.06119
====================================================
On the Unambiguous Distance of Multi-Carrier Phase Ranging with Random Hopped Frequencies (Peng Liu - 18 February, 2017)
In this paper, we try to find a deterministic value to depict the UD of an MPR system under the random spaced frequencies (RSF) configuration, serving as a metric of its measurable distance in engineering applications. Alternatively, we propose to adopt the upper bound of the random UD as the metric, because when the RSF set contains more than a dozen carriers, i) we prove that the probability that the random UD attains its upper bound is very close to 1 if phase noise is not introduced; ii) simulations show that the upper bound can also be attained reliably in the presence of phase noise
Link: https://arxiv.org/abs/1702.05616
====================================================
A Concurrent Perspective on Smart Contracts (Ilya Sergey - 17 February, 2017)
The described contracts-as-concurrent-objects analogy provides a deeper understanding of potential threats to smart contracts, indicates better engineering practices, and enables applications of existing state-of-the-art formal verification techniques.
Link: https://arxiv.org/abs/1702.05511
====================================================
How Much Does Users' Psychology Matter in Engineering Mean-Field-Type Games (Giulia Rossi - 25 February, 2017)
Basic empathy concepts are illustrated in several important problems in engineering including resource sharing, packet collision minimization, energy markets, and forwarding in Device-to-Device communications. The work also conducts an experiment with 47 people who have to decide whether to cooperate or not
Link: https://arxiv.org/abs/1702.05355
====================================================
Mining Behavioral Patterns from Millions of Android Users (Xuanzhe Liu - 22 March, 2017)
Supporting mobility has become a promising trend in software engineering research. The dataset of Wandoujia service profiles consists of two kinds of user behavioral data from using 0.28 million free Android apps, including (1) app management activities (i.e., downloading, updating, and uninstalling apps) from over 17 million unique users and (2) app network usage from over 6 million unique users
Link: https://arxiv.org/abs/1702.05060
====================================================
Small Boxes Big Data: A Deep Learning Approach to Optimize Variable Sized Bin Packing (Feng Mao - 14 February, 2017)
We show in this paper how to build such a system by both theoretical formulation and engineering practices. Our prediction system achieves up to 89% training accuracy and 72% validation accuracy to select the best heuristic that can generate a better quality bin packing solution.
Link: https://arxiv.org/abs/1702.04415
====================================================
Supporting Defect Causal Analysis in Practice with Cross-Company Data on Causes of Requirements Engineering Problems (Marcos Kalinowski - 13 February, 2017)
[Method] We collected cross-company data on causes of requirements engineering problems from 74 Brazilian organizations and built a Bayesian network
Link: https://arxiv.org/abs/1702.03851
====================================================
Multitask Learning with Deep Neural Networks for Community Question Answering (Daniele Bonadiman - 13 February, 2017)
Additionally, our method, which does not use any manual feature engineering, approaches the state of the art established with methods that make heavy use of it.
Link: https://arxiv.org/abs/1702.03706
====================================================
A Morphology-aware Network for Morphological Disambiguation (Eray Yildiz - 13 February, 2017)
In this work, while we focus on Turkish morphological disambiguation, we also present results for French and German in order to show that the proposed architecture achieves high accuracy with no language-specific feature engineering or additional resources. In the experiments, we achieve 84.12, 88.35 and 93.78 morphological disambiguation accuracy among the ambiguous words for Turkish, German and French respectively.
Link: https://arxiv.org/abs/1702.03654
====================================================
A Deep Convolutional Neural Network for Background Subtraction (Mohammadreza Babaee - 6 February, 2017)
With this approach, feature engineering and parameter tuning become unnecessary since the network parameters can be learned from data by training a single CNN that can handle various video scenes. For the training of the CNN, we used a random 5 percent of the video frames and their ground truth segmentations taken from the Change Detection challenge 2014 (CDnet 2014)
Link: https://arxiv.org/abs/1702.01731
====================================================
Beyond Evolutionary Algorithms for Search-based Software Engineering (Jianfeng Chen - 17 September, 2017)
Context: Evolutionary algorithms typically require a large number of evaluations (of solutions) to converge - which can be very slow and expensive to evaluate. Objective: To solve search-based software engineering (SE) problems, using fewer evaluations than evolutionary methods. Method: Instead of mutating a small population, we build a very large initial population which is then culled using a recursive bi-clustering chop approach. Results: Using just a few evaluations (under 100), we can obtain comparable results to state-of-the-art evolutionary algorithms. Conclusion: Just because something works, and is in widespread use, does not necessarily mean that there is no value in seeking methods to improve that method
Link: https://arxiv.org/abs/1701.07950
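The Method sentence above (a large initial population culled by recursive bi-clustering) can be sketched as follows. This is one possible reading of the excerpt, not the authors' implementation: the Euclidean distance, the choice of two distant "pole" candidates, the names `cull`, `east`, and `west`, and the minimization objective are all assumptions made for illustration.

```python
import random


def distance(a, b):
    """Euclidean distance between two candidates in decision space (assumed)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def cull(population, evaluate, min_size, rng):
    """Recursively halve `population`, evaluating only two candidates per level.

    `evaluate` returns a cost to minimize. Each level picks two distant
    poles, splits the population by which pole each candidate is closer
    to, and keeps the half around the better pole.
    """
    if len(population) <= min_size:
        return population
    anchor = rng.choice(population)
    east = max(population, key=lambda p: distance(p, anchor))
    west = max(population, key=lambda p: distance(p, east))
    # Order candidates from "nearest east" to "nearest west", then cut in half.
    ordered = sorted(population, key=lambda p: distance(p, east) - distance(p, west))
    half = len(ordered) // 2
    near_east, near_west = ordered[:half], ordered[half:]
    keep = near_east if evaluate(east) <= evaluate(west) else near_west
    return cull(keep, evaluate, min_size, rng)


rng = random.Random(0)
population = [(rng.random(), rng.random()) for _ in range(1024)]
survivors = cull(population, lambda p: p[0] ** 2 + p[1] ** 2, min_size=8, rng=rng)
print(len(survivors))
```

Because only the two poles are evaluated at each level, the number of evaluations grows with the logarithm of the population size, which is how a very large population can be reduced with well under 100 evaluations.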
====================================================
The Influence of Teamwork Quality on Software Team Performance (Emily Weimar - 22 January, 2017)
Traditionally, software quality is thought to depend on sound software engineering and development methodologies such as structured programming and agile development. Since the success rate of software development projects is low (Wateridge, 1995; The Standish Group, 2009), it is important to understand which characteristics of interactions within software development teams significantly influence performance. The relationship between TWQ and team performance and the improvement of the model are tested using data from 252 team members and stakeholders. Results show that teamwork quality is significantly related to team performance, as rated by both team members and stakeholders: TWQ explains 81% of the variance of team performance as rated by team members and 61% as rated by stakeholders
Link: https://arxiv.org/abs/1701.06146
====================================================
Applying empirical software engineering to software architecture: challenges and lessons learned (Davide Falessi - 21 January, 2017)
In the last 15 years, software architecture has emerged as an important software engineering field for managing the development and maintenance of large, software-intensive systems
Link: https://arxiv.org/abs/1701.06000
====================================================
Towards the Assessment of Stress and Emotional Responses of a Salutogenesis-Enhanced Software Tool Using Psychophysiological Measurements (Jan-Peter Ostberg - 22 February, 2017)
While studies of affect are emerging and rising in software engineering research, stress has yet to find its place in the literature despite being highly related to affect. We propose a controlled experiment for testing our hypotheses that a static analysis tool enhanced with the Salutogenesis model will bring 1) a higher number of fixed quality issues, 2) reduced cognitive load, 3) reduction of the overall stress, and 4) positive affect induction effects to developers
Link: https://arxiv.org/abs/1701.05739
====================================================
Software Architectures for Robotics Systems: A Systematic Mapping Study (Aakash Ahmad - 19 January, 2017)
The reported solutions have exploited model-driven, service oriented and reverse engineering techniques since 2005
Link: https://arxiv.org/abs/1701.05453
====================================================
Unhappy Developers: Bad for Themselves, Bad for Process, and Bad for Software Product (Daniel Graziotin - 10 February, 2017)
Recent research in software engineering supports the "happy-productive" thesis, and the desire of flourishing happiness among programmers is often expressed by industry practitioners. Using qualitative data analysis of the survey responses given by 181 participants, we identified 49 potential consequences of unhappiness while developing software
Link: https://arxiv.org/abs/1701.02952
====================================================
Exploration of Proximity Heuristics in Length Normalization (Pranav Agrawal - 5 January, 2017)
The paper prescribes a specific case of a generalized function for a recommendation system using feature engineering guidelines on the given data set. The proximity-feature-based ranking function outperformed regular BM25 by 52%.
Link: https://arxiv.org/abs/1701.01417
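For reference, the "regular BM25" baseline mentioned above can be sketched as below. This shows standard Okapi BM25 with its document-length normalization, not the paper's proximity-based function. The IDF variant (with +1 inside the logarithm), the defaults `k1=1.2` and `b=0.75`, and the toy documents are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs
    # Number of documents containing each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        # Length normalization: longer-than-average documents are penalized.
        length_norm = 1 - b + b * len(doc) / avg_len
        score = 0.0
        for term in query:
            if term not in term_freq:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            tf = term_freq[term]
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    ["build", "change", "build"],
    ["video", "tutorial"],
    ["build", "tutorial", "text", "video"],
]
print(bm25_scores(["build"], docs))
```

The parameter `b` controls how strongly document length is normalized; setting `b=0` disables length normalization entirely.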
====================================================
Fuzzy Based Implicit Sentiment Analysis on Quantitative Sentences (Amir Hossein Yazdavar - 3 January, 2017)